FOREWORD


The typical few-of-a-kind nature of NASA systems has made reliability a premium
even on the initial items delivered in a program. Reliability defined and treated on
the basis of percentage of items operating successfully has much less meaning than
when larger sample sizes are available as in military and commercial products.
Reliability thus becomes based more on engineering confidence that the item will work as
intended. The key to reliability is thus good engineering: designing reliability
into the system and engineering to prevent degradation of the designed-in reliability
from fabrication, testing and operation.

The PRACTICAL RELIABILITY series of reports is addressed to the typical engineer 
to aid his comprehension of practical problems in engineering for reliability. In 
these reports the intent is to present fundamental concepts on a particular subject in 
an interesting, mainly narrative form and make the reader aware of practical problems 
in applying them. There is little emphasis on describing procedures and how to 
implement them. Thus there is liberal use of references for both background theory 
and cookbook procedures. The present coverage is limited to five subject areas: 

Vol. I. - Parameter Variation Analysis describes the techniques for treating 
the effect of system parameters on performance, reliability, and other
figures-of-merit.

Vol. II. - Computation considers the digital computer and where and how it can 
be used to aid various reliability tasks. 

Vol. III. - Testing describes the basic approaches to testing and emphasizes 
the practical considerations and the applications to reliability. 

Vol. IV. - Prediction presents mathematical methods and analysis approaches 
for reliability prediction and includes some methods not generally covered in 
texts and handbooks.

Vol. V. - Parts reviews the processes and procedures required to obtain and 
apply parts which will perform their functions adequately. 

These reports were prepared by the Research Triangle Institute, Research Triangle 
Park, North Carolina 27709 under NASA Contract NASw-1448. The contract was
administered under the technical direction of the Office of Reliability and Quality
Assurance, NASA Headquarters, Washington, D. C. 20546 with Dr. John E. Condon,
Director, as technical contract monitor. The contract effort was performed jointly
by personnel from both the Statistics Research and the Engineering and Environmental 
Sciences Divisions. Dr. R. M. Burger was technical director with W. S. Thompson 
serving as project leader. 





This report is Vol. IV - Prediction. This subject has been of interest in 
reliability work since the earliest efforts of organized reliability activity. In 
these ensuing years much has been written on reliability prediction, but a given
work often concentrates on limited facets of the subject. This report synthesizes
reliability prediction, with emphasis on the basics and the scope. C. A. Krohn 
selected and organized the contents, and together with A. C. Nelson, Jr. prepared 
the material. W. S. Thompson provided helpful comments. 
ABSTRACT


The features and techniques of reliability prediction are identified and brought
together in this report. The approach is to:

(a) Bring together scattered material, 

(b) Present some material not in books or handbooks, 

(c) Identify several points which have a tendency to be missed, 

(d) Present some ideas which may be helpful to others involved in develop- 
ment of reliability prediction techniques, and 

(e) Express some opinions related to the role of reliability prediction. 

Material presented in this report is grouped into four major categories. 

Part I is largely qualitative discussion concerned with introduction and perspective. 
Contents include discussion and opinions on the role of reliability prediction, on 
perspective features, e.g. program phase and hardware level, on the relation to other 
analyses, and on the problems. Part II is concerned with reliability measures or 
definitions concerning single items, including data sources. Part III is devoted to 
the reliability prediction techniques which are suitable for general use and to 
classical reliability models. This material is scattered throughout the references; 
the treatment here mainly identifies approaches and relates them, with 
reliance on the references. Included for multi-item models are logic, lifetime, 
environment and bound-crossing topics. The remaining Part IV is concerned with 
concepts related to the detailed treatment of failure modes without independence 
assumptions. This is food-for-thought material from the results of research on 
reliability prediction techniques. This material in Part IV, in general, is not 
suited for widespread application. The Appendix presents a ready-reference on some 
basic probability laws and on various probability distributions. 
In this report the subject of reliability prediction is synthesized. It is
an attempt to "see the forest", but done while keeping both feet on the ground. 
Fundamentals are stressed in order to help develop a better understanding of what is 
involved. Expanded treatment is given to basic reliability measures, to some points 
which have a tendency to be somewhat misunderstood in the literature, and to several
topics which are not covered in existing books and handbooks but into which RTI has
delved. Other topics are identified and related to one another.

Reliability prediction as an organized discipline is approximately 15 years 
old. There are approximately a dozen books on the subject and approximately the same 
number of handbooks. There are many hundreds of reports and papers. Some which
treat the fundamentals will be relied on heavily.

The qualitative discussion of Part I on the scope of reliability prediction is 
suitable for any reader - design engineer, manager, reliability generalist, or 
reliability analyst. Parts II and III cover mainly conventional and classical 
approaches to reliability prediction and Part IV reports on some research on
structuring certain detail into a prediction. Parts II, III and IV will not be easy reading
for persons who are not knowledgeable in the mathematics of probability. Of course
perusing these parts will give any reader a flavor of the subject. If a reader
wants to understand the subject he will have to study the material as introduced here
and as elaborated in the references. If he does not know the mathematics of probability,
then he will first need to learn its fundamentals. The practicing reliability analyst 
should be familiar with most of the contents. For him, perhaps the manner in which 
the material is organized, the identification of references, and the results of 
research on reliability prediction techniques in Part IV will be of interest. 
Part I: Perspective


Reliability is the moniker which has been attached to those questions concerning 
whether or not an item will perform its intended function when it is ultimately used. 
Different ways of expressing a reliability index will be described later; some of the 
more common are mean time between failure, reliability, or failure rate. Reliability 
is different from traditional performance concerning explicit quantitative require- 
ments; a reliability requirement can be avoided by just not introducing it. As long 
as the resultant item has subjectively acceptable reliability, then there is no concern. 
On the other hand, if the item turns out to be excessively failure prone, the question
becomes "What went wrong?" There was no requirement ... there was no analysis ... there
was no measurement, even if this last task was plausible.

The current tendency is to treat this question quantitatively to the maximum 
extent which is "sensible;" otherwise the risk of unacceptable reliability is higher 
than is necessary. Whether the initial requirements for an item are being prepared
or are being responded to, they will typically contain a reliability index.
Treating reliability quantitatively brings the subject into open consideration. By 
explicitly treating reliability the designers will think about what is needed to 
fulfill the requirement. This considerably enhances the odds of getting something 
useful and on schedule. 

Even if an individual or a designer is not convinced of the need for quantita- 
tive treatment of reliability, he still cannot avoid the subject nor the need for 
some knowledge of it. When the designer is associated with the organization which 
will actually use the item, often the management pressure for a minimum total owner- 
ship cost compels quantitative reliability analysis by procedural requirements. When 
the designer is associated with an organization which is providing items to customer 
requirements, then there will usually be a quantitative reliability requirement and 
less often a specified procedure for measuring reliability. The pressure of this 
contractual requirement may be further increased if the contract is fixed price or 
has a fee incentive for measured reliability. 

In the early days of an engineering project the situation is such that treatment 
of chance is with probabilistic modeling. As the new designs evolve into physical 
items and measurements are made from testing, the situation changes into one where 
the treatment of chance is also with statistical inference. The material presented 
in this report is probabilistic. Measurement of reliability and the use of statistical
inference are given some coverage in Vol. III - Testing of this Report Series.
1. Treating Reliability Quantitatively

This section contains some definitions and discussion of the role, uses and 
accuracy of reliability prediction. 

1.1 NASA Definition of Reliability 

Reliability is defined in NASA Reliability Publication NPC 250-1 as: the probability
that a system, subsystem, component or part will perform its intended functions
under defined conditions at a designated time for a specified operating period [Ref. 1].
This definition will be used in this report. In the discussions the system, subsystem, 
component, or part will simply be referred to as a system or an item. When item 
is used, the material under discussion is potentially pertinent to any hardware level 
of aggregation. Multi-item or system will be used for bringing items together. 

1.2 Probabilistic Approach 

Reliability, in the quantitative sense as used here, is defined above as a 
probability. Perhaps another quantitative definition of reliability will evolve 
in the future which is not based on probabilistic concepts. For the present, however, 
it seems that quantitative treatment of reliability will involve probability and 
statistical inference. In one sense, this is unfortunate, as many engineers and 
managers have not had meaningful academic or other exposure to this subject. The 
subject is no more difficult than other ones of mathematics, but as with the other 
ones, it does take continued exposure to it over a period of time in order to be 
comfortable and confident with it. 

The material in this report relies on the basic probability concepts and laws 
which are briefly reviewed in the Appendix. The reader is encouraged to review them 
and, if this is new material, to study the references of the Appendix or other modern
books on probability. In particular, the plea is made to avoid what seems to be
a tendency to pick up a few formulas such as some from Parts II and III and to
overgeneralize their applicability. Rather, rely more on the fundamentals of the Appendix.
To the engineers, do not be hesitant about seeking consultation from a probabilistic 
mathematician or a statistician. 

The terms probability and statistical inference were used in the preceding 
paragraph. Probability is used in reference to an a priori situation, where assumptions 
are made concerning the probability descriptions of input information. Probability 
predicts the outcome from a set of assumptions. Statistical inference is used in 
reference to an a posteriori situation, where data is used to make inferences about 
the form of the distribution and to make estimates about the parameters of the distri- 
bution. Thus, probability is deductive and statistics is inductive. 
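
The distinction can be illustrated with a minimal sketch. It assumes a constant failure rate (exponential lifetime), which is an assumption made here for illustration only; the function names and the numerical values are hypothetical. The first function reasons deductively from an assumed parameter to a predicted probability; the second reasons inductively from observed data to an estimate of that parameter.

```python
import math

def predict_reliability(failure_rate, hours):
    """Deductive (a priori): assume a constant failure rate, predict survival probability."""
    return math.exp(-failure_rate * hours)

def estimate_failure_rate(failure_times, total_test_hours):
    """Inductive (a posteriori): estimate the rate as failures per unit of test time."""
    return len(failure_times) / total_test_hours

# A priori: the failure rate is an assumed input (hypothetical value, per hour).
predicted = predict_reliability(failure_rate=1.0e-4, hours=1000.0)

# A posteriori: the failure rate is estimated from hypothetical test data.
observed_failures = [1200.0, 3400.0, 8100.0]
rate_hat = estimate_failure_rate(observed_failures, total_test_hours=40000.0)

print(f"predicted reliability for 1000 h: {predicted:.4f}")
print(f"estimated failure rate: {rate_hat:.2e} per hour")
```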
1.3 Role

Reliability predictions may be performed for any of the following reasons: 

(1) Potential technical contribution, 

(2) Financial implications, and 

(3) Compulsory. 

Each of these could apply to the user (or buyer) of a system as well as to the supplier. 
The potential technical contribution is the most satisfying reason to the engineer.
For example, he may decide to search for areas needing reliability improvement.
However, the other reasons do occur. Financial implications arise in a fixed price or
incentive contract which also has an associated reliability requirement and method 
of measurement. The compulsory reason may typically apply to a government agency 
because of policy and to a supplier because of contract requirements. There is nothing 
derogatory about any of these reasons; each has a role in the mature blending of tech- 
nology, competition, and checks and balances. 

1.4 Uses 

Major uses of reliability prediction are: 

(1) To obtain a numerical value of a reliability index, 

(2) To obtain a numerical measure of uncertainty of the reliability prediction value,

(3) To search for needed improvements in the design or the operational procedure, and

(4) To allocate total system reliability optimally to the sub-items. 

The numerical reliability prediction number and its attendant measure of uncertainty
are usually necessary in order to respond to any of the reasons for performing a
prediction which are noted above. That is, they support a response to such questions as
"Can the mission be achieved?" or "What are the possibilities of making a profit?" or
simply "Here is what the customer asked for." Searching for reliability improvements and
probing around for weaknesses in the design and the operational procedure is the most 
technically appealing use. It is this use that often results in a reliability pre- 
diction going into more detail than it otherwise might. That is, comparative detailed 
values are sought rather than absolute gross values. Hopefully new alternatives will 
be opened up and the really bad choices can be eliminated. Literal optimization tech- 
niques, such as dynamic programming algorithms, offer the potential of improved allo- 
cation of overall reliability among the items comprising the system. Of these uses, 
obtaining the prediction number and searching for improvements have seen more appli- 
cation than the other two. 

1.5 Accuracy

With the extensive experience accumulated with reliability prediction, it is 
now possible to make some intelligent judgments on accuracy, even if only qualitative.
When there is a fair amount of historical data and the equipment is not excessively 
complex or new, a crude rule-of-thumb for electronic equipment would be to expect 
the actual mean time between failure (MTBF) to be within the range of 50 to 200 percent 
of the predicted MTBF. This accuracy would apply to the case of an experienced analyst 
making his best effort, i.e., one which is not unduly optimistic or pessimistic. 
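
As a rough illustration of what this rule-of-thumb implies, the sketch below converts a predicted MTBF into the 50 to 200 percent band and then into a band on mission reliability. The conversion to reliability assumes a constant failure rate, which is an assumption made here and not part of the rule itself; the function names and numbers are hypothetical.

```python
import math

def mtbf_band(predicted_mtbf, low_factor=0.5, high_factor=2.0):
    """Range in which the actual MTBF might fall under the 50 to 200 percent rule-of-thumb."""
    return predicted_mtbf * low_factor, predicted_mtbf * high_factor

def reliability_from_mtbf(mtbf, mission_hours):
    """Mission reliability under an assumed constant failure rate of 1/MTBF."""
    return math.exp(-mission_hours / mtbf)

# Hypothetical prediction: 400 h MTBF, 100 h mission.
mtbf_low, mtbf_high = mtbf_band(400.0)
r_low = reliability_from_mtbf(mtbf_low, 100.0)
r_high = reliability_from_mtbf(mtbf_high, 100.0)

print(f"MTBF band: {mtbf_low:.0f} h to {mtbf_high:.0f} h")
print(f"mission reliability band: {r_low:.3f} to {r_high:.3f}")
```

The width of the resulting reliability band suggests why a prediction number is better read as a comparative index than as an absolute value.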

At the equipment level and the parts level, it is often possible to give the most 
accurate prediction possible with only a small amount of effort. That is, the point 
of diminishing returns is quickly reached in reliability predictions as far as the 
accuracy of the prediction number is concerned. It must be noted that the prediction 
analysis will usually go into more detail in searching for reliability improvements.

If the inputs, the tools, and the assumptions of the reliability predictions are 
reasonably accurate and understood, then there is no reason why the results cannot
be appraised so that the prediction can be intelligently used.

The competitive nature of the buyer-seller environment quite understandably 
has an influence on the accuracy of reliability prediction. There is probably a 
tendency to get more accurate predictions, at least more conservative ones, if there
exists a firm reliability requirement, a method of reliability measurement, and firm 
dollar implications. Those who use reliability predictions of others, e.g. those 
at higher levels of system aggregation, must realize that those at the lower levels 
will tend to present predictions which will make the supplier look best at the time 
the prediction is made. That is, the equipment supplier will often not account for 
rough handling, for unverified failures on the part of the operators, for unforeseen 
environments, or possibly for burn-in. A final remark on the accuracy of reliability
prediction is the realization that other system characteristics such as cost, schedule, 
repair time, or even performance tend to have inaccurate predictions at the early 
stages in the life cycle. As the program progresses through the life cycle there 
is an opportunity to measure some of these characteristics, whereas it may never
really be possible to measure reliability accurately.
2. Prediction and Allied Approaches

In this section the classical reliability prediction techniques and those which 
are suitable for active program usage are briefly identified by key words. Also, 
selection of a particular technique and other analyses related to reliability prediction 
are briefly discussed. Parts II and III of this report will give further introduction 
to the reliability prediction techniques which are only cited here. The purpose 
is to identify reliability prediction approaches and to fit them into related analyses.

2.1 Prediction Techniques 

Figure 2-1 shows key words associated with reliability measures of single items 
and Fig. 2-2 does the same for the conventional and classical reliability prediction 
modeling approaches for multi-item systems. The sections of this report where the 
topic is covered are cited on the figures. Broadly speaking, the single item measures 
of Fig. 2-1 can apply to various levels of system aggregation, e.g., parts, 
equipment, and system levels as well as to human events. That is, they offer indices 
by which to describe some of the inputs to a multi-item reliability prediction and 
by which to express some of the outputs. 

Reliability predictions implemented with the approaches of Figs. 2-1 and 2-2 
have typical assumptions and characteristics. Quite often these are unstated; they 
are just implied. Some typical assumptions and characteristics are: 

(1) A "fuzzy" definition of the failure of items and system. Is it: Out of
specification? Simply inoperative? Complete catastrophic failure?

(2) The prediction usually considers each item involved to have two states, 
either good or bad. In most predictions this is reasonable; however, 
there are certain situations where this can lead to grossly incorrect 
results. A familiar example is ignoring the two failed states of open 
or short of diodes in redundant arrangements.

(3) Independence among items is liberally assumed. Included here is the 
impact of not considering uncertainty in the natural or induced
environments.

(4) Prediction is for a mature product. A prediction is quite often mute 
on the assumption that most design and manufacturing "goofs" have been 
removed and that the necessary burn-in period has been passed. This has 
serious implications for items such as those intended for space which 
are produced in small quantity. 

(5) Omission of the human element during operation. 

(6) Uncertainty in the parameters of single items is often not considered
explicitly. Techniques of sensitivity analysis and of probabilistic 
treatment of uncertainty are potentially applicable. 
Figure 2-1 Reliability Definitions for Single Items Cited in Part II
Of course, exceptions in particular predictions will exist for these typical
assumptions and characteristics. These apply to the majority, not the exceptional case.
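
The diode example cited under item (2) can be made concrete with a short sketch. It assumes each diode is in exactly one of three mutually exclusive states (good, failed open, failed short) and that the diodes fail independently; the probabilities used are hypothetical. Under these assumptions a parallel arrangement fails if any diode shorts or if all diodes open, and a series arrangement fails if any diode opens or if all diodes short.

```python
def parallel_reliability(q_open, q_short, n):
    """n diodes in parallel: works if no diode is shorted and not all are open."""
    return (1.0 - q_short) ** n - q_open ** n

def series_reliability(q_open, q_short, n):
    """n diodes in series: works if no diode is open and not all are shorted."""
    return (1.0 - q_open) ** n - q_short ** n

def naive_parallel_reliability(q_open, q_short, n):
    """Two-state treatment: lumps open and short into one 'bad' state."""
    return 1.0 - (q_open + q_short) ** n

# Hypothetical per-diode failure probabilities.
q_open, q_short = 0.01, 0.05

single = 1.0 - q_open - q_short
naive = naive_parallel_reliability(q_open, q_short, 2)
actual_parallel = parallel_reliability(q_open, q_short, 2)
actual_series = series_reliability(q_open, q_short, 2)

print(f"single diode:              {single:.4f}")
print(f"two in parallel (2-state): {naive:.4f}")
print(f"two in parallel (3-state): {actual_parallel:.4f}")
print(f"two in series (3-state):   {actual_series:.4f}")
```

With shorts more likely than opens, the two-state model credits the parallel pair with an improvement while the three-state model shows it to be worse than a single diode; the series pair is the arrangement that helps.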

2.2 Prediction Technique Choice 

The approach used for a particular prediction is influenced by a myriad of 
factors. Implications of some factors such as the commodity and its intended opera- 
tional profile tend to be somewhat apparent. Here the need is mainly that of knowledge 
about the various reliability prediction indices and equations of the sort noted 
in Figs. 2-1 and 2-2. Other factors such as schedule and the extent of the intended 
reliability program are more subjective in their implications. Here the influence 
is more on the choice of parameter values rather than the choice of equations. 

There is little to be said about matching a particular technique to a particular
situation that is not already apparent to someone who understands the subject.
About all that seems appropriate is a plea to use common sense, e.g., use the
simplest approach commensurate with the purpose of the analysis
and the accuracy of the parameter data.

Life Cycle. Reliability prediction plays its major role during the planning
and early design phases of a program's life cycle. In these earlier phases of the
program a priori techniques of the sort described in this report are used. It can 
be expected that there will be many iterations of a prediction as the program pro- 
gresses, and that the prediction model will become more complex. When the program 
progresses to the point that test and operational data start to become available, 
then the a priori prediction techniques start to give way to the a posteriori
techniques of statistical inference. The reliability prediction model still has a role.
It provides a means for combining statistical inference estimates on items into a
composite measure for multi-item levels.

Commodity Considerations. Reliability prediction at the nonrepairable part
level is largely a matter of selection of the appropriate form from Part II by which 
to express the reliability measure. This includes the designation of the stresses 
which are appropriate. At the equipment level, say in the order of hundreds of piece 
parts, the main consideration as to what technique is used depends largely on whether 
the equipment is electrical or non-electrical. Electronic equipment typically uses 
a very straightforward approach. The parts are assumed to have a constant failure 
rate and failure rates are added to obtain the failure rate (or its reciprocal,
mean-time-between-failure) for the equipment (See Sec. 8.1.1). This approach is also
sometimes used for nonelectrical commodities, but often more involved prediction 
techniques will be applied to mechanical or structural commodities. The
stress-strength approaches of Secs. 5.3 and 9.2 are structurally oriented, but they are
really specialized applications of broadly applicable approaches concerning
environment effects which are noted in Secs. 4.4 and 9.1. At the system level the techniques
which are used tend to be quite varied and the entire scope of Part III is applicable. 
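
The straightforward electronic-equipment approach mentioned above can be sketched as follows. The sketch assumes that every part has a constant failure rate, that parts fail independently, and that any part failure fails the equipment (no redundancy). The part types, quantities, and failure rates are hypothetical and are not taken from any handbook.

```python
def equipment_failure_rate(parts):
    """Sum quantity * failure rate over all part types (failures per hour)."""
    return sum(quantity * rate for quantity, rate in parts.values())

def mtbf_hours(failure_rate):
    """MTBF is the reciprocal of the constant equipment failure rate."""
    return 1.0 / failure_rate

# part type: (quantity, failure rate per hour) -- hypothetical values
parts = {
    "resistor": (20, 0.5e-6),
    "capacitor": (10, 1.0e-6),
    "transistor": (5, 4.0e-6),
}

total_rate = equipment_failure_rate(parts)
equipment_mtbf = mtbf_hours(total_rate)

print(f"equipment failure rate: {total_rate:.1e} per hour")
print(f"equipment MTBF: {equipment_mtbf:.0f} hours")
```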

Mission Implications. The operational time periods and the environments of
the operational profile, of course, affect the choice of the reliability prediction
technique. Space systems have been thus far of a one-shot nature with the main periods 
of launch, orbit, and recovery. Launch and recovery environments tend to be somewhat 
severe, whereas orbital environments tend to be moderate. A major constraint handicapping 
reliability measurement prior to use has been the combined effects of the lack of 
experience, the cost, and the small quantity of some commodity types. The decade 
of space experience has alleviated these conditions to some extent. However, presently 
the nature of space missions is being extended to deep space missions and eventually 
to commodities reusable after recovery. Thus an increasing number of one-shot, non- 
reusable commodities are on the verge of giving way to multiple use, repairable com- 
modities. Space reliability predictions will thus start to take on more of a similarity 
to predictions for systems intended for airborne and surface missions. These latter 
types are as much concerned with the implications of repair, that is maintainability, 
as they are with reliability. The typical airborne system which is not operating 
continuously is desired to have a very high overall availability followed by a very 
high reliability for relatively short missions. The overall availability here is 
of a continuous nature and the missions are of a cyclic nature. Systems for surface 
missions often are continuously operational, but they can quite often be removed 
from operation for repair or maintenance. However, they usually will have short 
periods of intense operation where no repair is possible. These short periods may 
be somewhat predictable, as for space-oriented services, or they may be nonpredictable 
as for military uses. Presentations on maintainability and availability prediction 
techniques are available in Refs. 2 and 3. 
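
The airborne case described above, high availability followed by high reliability for a short mission, can be illustrated with a minimal sketch. It uses the common steady-state relation availability = MTBF / (MTBF + MTTR) and a constant failure rate during the mission; both are assumptions introduced here for illustration, not methods taken from Refs. 2 and 3, and the numbers are hypothetical.

```python
import math

def steady_state_availability(mtbf, mttr):
    """Long-run fraction of time the system is operable (assumed steady-state form)."""
    return mtbf / (mtbf + mttr)

def mission_success_probability(mtbf, mttr, mission_hours):
    """Probability the system is up at mission start and survives the mission."""
    available = steady_state_availability(mtbf, mttr)
    survives = math.exp(-mission_hours / mtbf)  # constant failure rate assumed
    return available * survives

# Hypothetical airborne equipment: 500 h MTBF, 5 h mean repair time, 4 h mission.
availability = steady_state_availability(500.0, 5.0)
success = mission_success_probability(500.0, 5.0, 4.0)

print(f"availability: {availability:.4f}")
print(f"mission success probability: {success:.4f}")
```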

Subjective Factors. The ultimate accuracy is primarily affected by the subjective
judgment of the person performing the reliability prediction. Main considerations 
are the kind of reliability program with attendant influences of budget, schedule, 
and the operational environment. Historical experience in reliability prediction, 
particularly where it has been followed up with reliability measurements, has helped
considerably in this area. Detailed listings have been made of the many variables 
which are pertinent [Ref. 4] but in the final analysis this is largely a matter of 
mental assimilation on the part of the person performing the prediction. 

2.3 Related Analyses 

Other types of analyses overlap and interface with reliability predictions. 
Brief comments are given below on these allied studies. The comments are aimed
primarily toward the equipment and system level of commodity complexity, and particularly
toward the latter. System effectiveness is currently a popular phrase which is used 
to cover the scope of the considerations cited below. Some system effectiveness 
models have been proposed which attempt to pull together the appropriate ingredients 
[Refs. 5 and 6]. This is possible to some extent with the gross models and their 
attendant assumptions. A system effectiveness analysis will typically reflect the 
effort of various individuals as it is unlikely that any one individual can master 
or have the time to perform all of the analysis areas intended in any one program. 

Other Reliability Analyses. During the planning and early design program phases
the other reliability analyses, in addition to prediction, can be classified into 
failure mode and effects, performance variation, and stress as suggested in Ref. 7. 
Failure mode and effects analyses are often probing to a level of detail which is 
not reflected in the reliability prediction model. It is often of a semi-qualitative 
nature. It is conceptually possible to reflect extremely detailed failure modes
into reliability prediction.* However, there are usually the practical reasons of 
the unavailability of data and the complexity of such models which prevent a literal 
one-to-one correspondence between the failure mode and effects analysis and a reliability 
prediction. 

The performance variation analysis is concerned with the area of reliability
prediction which in this report is referred to as bound-crossing. The reliability
discipline has promoted approaches for drift failure analysis of electronic circuits
which are commonly referred to as worst-case or as tolerance analysis techniques.
These have proved to be of value for purposes of reliability improvement. However,
they almost invariably are not extrapolated over into the reliability prediction
analysis. Again this is conceptually possible but usually not done for sound reasons.*
It should be noted that with mechanical and structural commodities there has been
greater use made of bound-crossing techniques for reliability prediction purposes.
A prominent example here would be the classical stress-strength model. Those which
are conventional or classical are noted in Parts II and III of this report.
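
A worst-case analysis of the kind mentioned above can be sketched for a simple resistive voltage divider. The circuit, tolerances, and performance bounds are hypothetical. The sketch evaluates the output at every combination of tolerance extremes, which bounds the output only when the performance equation is monotonic in each parameter over its tolerance range, as it is for this circuit.

```python
from itertools import product

def divider_output(v_in, r1, r2):
    """Output voltage of a resistive divider (deterministic performance equation)."""
    return v_in * r2 / (r1 + r2)

def worst_case(nominals, tolerances, func):
    """Evaluate func at every corner of the tolerance box; return (min, max)."""
    outputs = []
    for signs in product((-1.0, 1.0), repeat=len(nominals)):
        values = [n * (1.0 + s * t) for n, s, t in zip(nominals, signs, tolerances)]
        outputs.append(func(*values))
    return min(outputs), max(outputs)

# Hypothetical design: 10 V supply at 1 percent, two 1000 ohm resistors at 5 percent.
v_low, v_high = worst_case([10.0, 1000.0, 1000.0], [0.01, 0.05, 0.05], divider_output)

# Hypothetical performance bounds; crossing either one counts as a drift failure.
within_bounds = 4.5 <= v_low and v_high <= 5.5

print(f"worst-case output: {v_low:.4f} V to {v_high:.4f} V, within bounds: {within_bounds}")
```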

Stress analysis typically has the most explicit relationship with a reliability 
prediction. This is because many of the reliability prediction manuals include the 
applicable stress derating and failure rate adjustment tables and curves. Examples 
are the effect of temperature, current or wattage levels on the failure rate. In 
the nonelectronic commodity the stress-strength model would be an example of a technique 
which is common to stress and prediction analysis. 
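
A minimal sketch of a stress-strength calculation follows. It assumes that stress and strength are independent and normally distributed, which is one common form of the model and is an assumption made here; the means and standard deviations are hypothetical. Reliability is then the probability that strength exceeds stress.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def stress_strength_reliability(mean_strength, sd_strength, mean_stress, sd_stress):
    """P(strength > stress) for independent, normally distributed stress and strength."""
    margin_mean = mean_strength - mean_stress
    margin_sd = math.sqrt(sd_strength ** 2 + sd_stress ** 2)
    return normal_cdf(margin_mean / margin_sd)

# Hypothetical values in consistent units (e.g., thousands of psi).
reliability = stress_strength_reliability(
    mean_strength=50.0, sd_strength=4.0, mean_stress=35.0, sd_stress=3.0
)

print(f"stress-strength reliability: {reliability:.5f}")
```

Here the margin has mean 15 and standard deviation 5, so the result is the standard normal probability of falling below three standard deviations.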


* Part IV of this report presents some thoughts on structuring a detailed reliability 
prediction model which explicitly incorporates this detail. 





Conventional Design Analyses. These traditionally involve both performance and
stress calculations and are what the design engineer would do to some extent regard- 
less of whether he is explicitly concerned with reliability analysis. The performance 
analysis is mainly related to the bound-crossing type of reliability measure and to the 
performance variation analysis. The traditional deterministic, engineering equations, 
relating performance attributes to part characteristics and other variables, become 
part of the performance variation analysis. Similarly, the traditional deterministic 
stress equations are developed and used by the designers. Calculation of safety margins
for such factors as voltage, power and temperature is a familiar form of this type of
stress investigation. 

Safety. Systems analyses for manned space missions have always been directed
toward both safety and reliability. In terms of the impact on the reliability predic- 
tion model, it usually turns out that the same, or slightly modified, prediction approach 
will serve the safety prediction needs as well as those of reliability. For safety 
there will typically be a different criterion of failure and a different operational 
profile than for reliability. Also note that in nonspace types of systems, safety 
analyses are also being performed [Ref. 8].

Availability and Maintainability. When repair is possible during application,
then availability and maintainability cannot be avoided in the prediction. This adds 
a measure of complexity to the prediction technique, as the reliability prediction 
literally becomes absorbed by the availability prediction. Some comments have pre- 
viously been made on these analyses in Sec. 2.2 under the heading of Mission Implications. 

Spares. Reliability prediction techniques have been experiencing increased
applications in spares planning and optimization. These may seem to be inseparable; 
nevertheless, reliability analysis and spares analysis have been traditionally performed 
by separate groups. Furthermore there are reasons which from the spares viewpoint 
cause items to have higher failure rates than from the operational viewpoint. Examples 
here would be the effects of secondary failures and the replacement of incorrectly 
diagnosed failures. It is also noted that optimum spares allocation and optimum 
redundancy allocations can use identical approaches. 

Cost Trade-off. If cost-reliability relationships are available for single items, 
then for some forms of multi-item configurations the literal optimization techniques 
can be applied in order to obtain optimum reliability values for items. Also, to some 
limited extent this can be expanded to include simultaneous optimization of reliability 
and spares or reliability and maintainability. The main limitations here are the 
accuracy of the cost-reliability or cost-maintainability relationships and those of 
optimization techniques. Note also that the optimum allocation techniques find 
application for penalties other than cost, e.g., volume, weight, or power, or perhaps 
simultaneous treatment of several of these. 

The basic reliability allocation problem which is amenable to analytical solution 
is that of selecting an optimum system configuration from allowable alternate design 
approaches so that reliability is maximized subject to a penalty constraint, or vice 
versa. It is necessary to have a reliability prediction equation which covers the 
range of allowable alternate design approaches and, similarly, a penalty prediction 
equation. Thus one use of the reliability prediction equations of this 
report is to provide an input for a reliability allocation. An approach to this 
problem can be developed based on the dynamic programming principle. As would be 
expected, exact solutions can be obtained only for problems which are usually too simple 
to be of practical value. For example, Ref. 9 gives a dynamic programming procedure 
for selecting exactly the order of redundancy in the case of one constraint 
and the active form of redundancy. Procedures for more realistic problems can be 
developed, but they usually yield only an approximate solution. However, the inexactness 
usually will not result in differences of practical importance. 
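
The simple case just described (one cost constraint, active redundancy only) can be sketched as a small dynamic program. The following is an illustrative sketch only and is not the procedure of Ref. 9; the function name `allocate_redundancy`, the assumption of independent identical items within a stage, the positive integer costs, and the numerical values are all hypothetical choices made for the illustration.

```python
from functools import lru_cache


def allocate_redundancy(reliabilities, costs, budget):
    """Choose the number of active parallel items at each series stage.

    Maximizes the product of stage reliabilities 1 - (1 - r)**n subject to
    sum(cost * n) <= budget, with at least one item per stage.
    Assumes independent, identical items within a stage and positive
    integer costs. Returns (system_reliability, counts), or (0.0, None)
    if even one item per stage exceeds the budget.
    """
    stages = len(reliabilities)

    @lru_cache(maxsize=None)
    def best(stage, remaining):
        # Best result for stages stage..end using at most `remaining` budget.
        if stage == stages:
            return 1.0, ()
        # Cost of giving every later stage its mandatory single item.
        reserve = sum(costs[stage + 1:])
        best_value, best_counts = 0.0, None
        n = 1
        while costs[stage] * n + reserve <= remaining:
            stage_rel = 1.0 - (1.0 - reliabilities[stage]) ** n
            tail_value, tail_counts = best(stage + 1, remaining - costs[stage] * n)
            if tail_counts is not None and stage_rel * tail_value > best_value:
                best_value = stage_rel * tail_value
                best_counts = (n,) + tail_counts
            n += 1
        return best_value, best_counts

    return best(0, budget)


# Hypothetical two-stage system: item reliabilities 0.9 and 0.8,
# unit costs 1 and 2, total budget 6.
system_reliability, counts = allocate_redundancy((0.9, 0.8), (1, 2), 6)
print(counts, round(system_reliability, 4))
```

The memoized function `best(stage, remaining)` is the dynamic programming state: the best reliability obtainable from the remaining stages with the remaining budget. Realistic problems with several constraints or mixed redundancy types enlarge this state quickly, which is consistent with the remark that exact solutions are limited to simple cases.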

Refs. 10 and 11 describe computerized approaches which are suitable for realistic 
problems. The approach in Ref. 10 is for identifying an optimum redundancy 
configuration where each item in the system can be active, standby with switch, or spare 
redundancy. It is assumed that only one item must work, that the items have an 
exponential failure distribution, and that the failure (or success) events for the items 
are mutually independent. Ref. 11 treats essentially the same problem ignoring the 
switch but introducing the non-serial, e.g., a "bridge," configuration. The former 
paper is patterned after the results in Ref. 10 but allows for more practical 
redundancy alternatives. 

It was decided that the result given in Ref. 10 could be generalized to include 
the case in which at least $n_o$ items must work out of n items ($n_o < n$). In order to 
do this it was necessary to derive a general reliability prediction formula for 
parallel arrangements, as shown in Sec. 8.4. This formula has been computerized and 
the program is discussed in Volume II - Computation. This program is actually a 
subroutine in the general Reliability Cost Trade-off Program (RECTA). The subroutine 
enables one to consider majority voting redundancy as well as the three types of 
redundant items noted above. Practical procedures for obtaining an optimum 
selection of the reliability of items in series can also be based on a dynamic programming 
procedure. This is the case where the reliability of an item is improved by such means 
as design and manufacturing emphasis on reliability, and redundancy is not allowed. 

The largest difficulty here is obtaining an accurate relationship between item 
reliability and cost. The general reliability cost trade-off program (RECTA) cited above 
simultaneously treats configurations containing series and the various redundancy 
approaches. Here the allowable alternatives for an item include increasing the 
reliability of the non-redundant item and/or making the item redundant. Any of these 
alternatives can be disallowed; the result is thus a generalized series-parallel 
reliability allocation procedure. 
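
The requirement that at least $n_o$ of n items must work has a simple closed form in the most restricted case. The sketch below assumes n identical, mutually independent, active items, each with reliability r; it is not the general formula of Sec. 8.4 (which also covers the other redundancy types), and the function name is a hypothetical one chosen for the illustration.

```python
from math import comb


def k_out_of_n_reliability(k, n, r):
    """Probability that at least k of n independent, identical active items work.

    Sums the binomial probabilities of exactly j working items for j = k..n.
    """
    return sum(comb(n, j) * r ** j * (1.0 - r) ** (n - j) for j in range(k, n + 1))


# Majority voting with three items (2 out of 3), each with reliability 0.9.
print(round(k_out_of_n_reliability(2, 3, 0.9), 4))
```

With k = 1 the expression reduces to ordinary active redundancy, 1 - (1 - r)^n, and with k = n to a series arrangement, r^n.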

RECTA as cited above was developed as part of an evaluation of computer 
programs for system effectiveness [Ref. 12]. This reference and other sources 
call attention to the possible use of allocation procedures based on linear and 
quadratic programming and on Lagrange multipliers. These approaches have usually not 
proven suitable for realistic reliability-cost allocation problems. 

3. Needs and Problems 

The largest need is that reliability prediction be included or be considered 
as an essential element of the actual decision-making process. This is not just 
a matter of design engineers and managers tolerating the reliability prediction, 
but rather one where the desired situation is that these persons need and want the 
results of the prediction. The reliability prediction should be influencing the 
design and operating plans, rather than a separate exercise whose outputs are ignored 
or forced to justify a preconceived design and operating plan. The problems here 
are grouped into those concerning people, data, and techniques. These remarks are 
not in the sense of criticizing anyone or any discipline; rather, they are intended 
as unemotional commentary. 

3.1 People 

Reliability prediction relies heavily on the mathematics of probability. 
Reliability measurement and testing utilize the mathematics of statistical inference. 
These are both complex subjects that are difficult to learn well. In addition 
to the practical knowledge required for applying them, the theory is also important. 
The majority of technical persons, including designers and managers as well as reliability 
engineers, typically have not had the opportunity to become well-versed in the mathematics 
of probability and statistical inference prior to their initial attempts at using 
them. 

The solution here is not at all readily apparent. A probability or statistics 
course or two in the college curriculum or a concentrated short course after college 
really only helps the person communicate better with someone who is well-versed in 
these subjects. Persons specifically trained in the mathematics of probability or 
statistics, on the other hand, have their difficulties in understanding the engineering 
applications. Such a person in a product-oriented organization will typically have 
difficulty adjusting to the approximate nature of engineering mathematical models, to 
the myriad of pertinent variables which cannot be reflected simultaneously in equations, 
and to the situation that testing to satisfy statistical confidence often requires 
sample sizes which are unrealistically large in view of cost considerations. 

There is some sentiment for having the design engineer also pick up the task 
of reliability prediction and the other reliability analyses. There is much to be 
said for this; after all, these persons, particularly at the equipment level, are 
usually called upon to provide cost, weight, and other predictions in addition to 
strict performance. It is generally accepted that the designer is "responsible" 
for "designing reliability" into the equipment; it follows that he should have some 
degree of responsibility for the reliability analysis of his design. For electronic 
equipment this may be a reasonable approach. A few suppliers are doing this. 
They do not have reliability specialists, or if they do, the specialist performs the role of a 
consultant. For structural and mechanical commodities and for systems, the reliability 
prediction is more complex than for electronic equipment. The approach of having 
the designer also perform the reliability prediction is more difficult here. Even 
if management decides that the approach of having the designer perform the prediction 
is desirable, it is still difficult to implement. These design people are generally 
already overloaded in their work schedules. Also, they may not be interested. 

A nagging consideration to many persons is that the mathematics of probability 
and statistics have enjoyed successful application in many areas, for example, 
communications, economics, biology, agriculture, and information theory. It seems that it 
is the reliability area which perhaps uniquely has a somewhat unsuccessful history of 
application of probability and statistics. At least the road here has been a lot 
rougher than in other areas. One cannot help but feel that a major reason is that 
many of the people who have been involved in reliability prediction - the people making 
the predictions as well as other persons who are expected to use the results - have just been 
weak in the theory and practical applications of probability and statistics. 

3.2 Data 

Data refers to the actual numerical values of reliability indices for various 
items. Thus data, one way or another, revert back to some type of reliability 
measurement. Even once the need is recognized, there is the problem of how to go 
about making reliability measurements. What is the best index? The greater the 
reliability of any item, the more difficult it is to measure. Who is to pay for it? 
A part or equipment supplier often will deliver his product and will never hear 
anything further regarding reliability, particularly if it is satisfactory. Experience 
with the reliability measurement of operational items has indicated that it is nearly 
impossible to rely on operational and maintenance personnel to supply these data; 
special persons have to go along just to record the reliability information. In 
addition, it is recognized that it is more glamorous to work 
with models and equations than to try to record and interpret data. 

Efforts, of course, have been made at gathering and disseminating data, and these 
continue (Sec. 6 of this report contains some data references). These are to be 
commended. Contractors are more and more developing their own data and building data 
banks, but they are handicapped: individual contractors do not have wide access to 
operational sites, nor do they have much funding for this. More efforts are needed 
in the data area, and of necessity these will require funding by various government agencies. 
With regard to individual programs, there is a great opportunity for using sampling 
techniques rather than recording everything, particularly with actual operating 
equipment and systems, in order to gather needed information. 

3.3 Techniques 

A few comments are given here on the area of reliability prediction technique 
needs; however, this is not as large a problem area as those of people and data. 
Conflicting positions can be easily taken. On one hand, it can be said that more complex 
techniques are not generally going to be applied because they invariably need better data, 
and it is unlikely that such data will become generally available. On the other 
hand, it can be argued that complex situations require complex mathematical models. 
In any case, efforts will continue for technique development. It is something that 
can be done individually and without major funding. It is the sort of thing that 
people who are inclined in this direction will continue to do whether they have a great 
deal of support or not. 

Computers continue to get faster, with larger storage. This opens the 
door to more involved and more complex analyses. As for rationale, there is a need to 
cycle more practical experience back into the development of prediction techniques. 
This is now becoming possible more than previously because of the increased experience 
with reliability prediction. 

At the system reliability level, opportunity areas include more explicitly bringing 
in the human impact and the environment, that is, treating the reliability of man as 
well as the machine and treating other unknowns, such as possibly the environment, as 
probabilistic variables. At the equipment level a need is how to formulate 
probabilistic models for treating distinct failure modes simultaneously with environment 
(Part IV of this report presents some thoughts on this). At the systems level again, 
there is a need for improved methods to tie together maintainability, spares, 
performance, and cost with reliability. This has been labeled systems effectiveness, and 
there are efforts under way here as noted in Sec. 2.2. 

Also to be given due consideration is the opportunity for less complex methods, 
that is, striving for balance between complexity of the prediction technique and 
accuracy of the result. There are places for simple rules of thumb and for simple 
estimating relationships. 

Part II. Single Item Reliability 


In this part the concept of reliability measures for a single item is discussed 
from a broad viewpoint. The reliability measures consider two basic categories of 
problems: (1) those in which an item is in either a success or a failed state 
(considered in Sec. 4) and (2) those in which certain characteristics of an item 
may be of an unacceptable value, the "bound-crossing" problem (considered in Sec. 5). 
Guidance on obtaining numerical index values for a single item of both categories is 
given in Sec. 6. 

These reliability measures are potentially applicable to any item or event to 
be considered in a prediction. Thus inputs for multi-item prediction equations would 
be of one of the forms covered, as would the output of the prediction. Or, if the 
reliability measures for an item are obtained from testing, then inferences would be 
made concerning these measures. 

These definitions will, in the main, be well-known to reliability workers. 
Some features are covered, however, which are not emphasized in existing handbooks 
and books. These are the following: Possible confusion concerning mathematical 
descriptions of the widely cited bathtub curve when it is used for non-repairable 
items such as parts versus repairable items such as equipment is discussed in 
Secs. 4.3 and 4.8. Uncertainty in the environment is discussed in Sec. 4.4. 
Explicitly bringing time into consideration for bound-crossing problems is introduced 
in Sec. 5.4, where possible failure criteria for non-monotonic drift require 
careful treatment. 

4. Reliability Measures 

Various indices used for reliability measures are described in this section, 
and there is a probing beyond conventional assumptions. The material gets progres- 
sively more involved, starting with simpler notions and models. The later part of 
this section goes into considerations of reliability measures for repaired items. 

4.1 Definitions of States and Reliability 

The simplest way of classifying the state of an item is as two states, 
success (S) and failure (F). Let P(S) be the probability of success and P(F) be the 
probability of failure of the item subject to given conditions under which the 
probability measures are to be defined. Then 

R = P(S) = probability of success, 

1 - R = P(F) = probability of failure, and clearly 

P(S) + P(F) = 1. 

This simple classification and the associated indices of reliability and unreliability 
are based on several assumptions such as the following: 

(1) a definition of failure exists, 

(2) the probabilities of success (or failure) are conditional on a known 
(deterministic) environment, or on known characteristics of the environment 
described by probabilistic measures, and 

(3) the classification is for a certain future time instant or time interval. 
Much of the subsequent material in this section involves expanded treatment of these 
assumptions. The assumptions should be kept in mind, but more important, they should 
also be kept in perspective. Most definitions and mathematical models are based on 
assumptions which are not fully met when associated with real world situations. The 
delicate question is always one of the effects of violation or relaxation of the 
assumptions for the problem which is at hand. Sometimes extremely simple equations 
will do the job; at other times extremely complex equations are needed. In some 
situations the state of an item should be subdivided into three states S, $F_1$, and $F_2$ 
for an adequate approximation to real world application. $F_1$ and $F_2$ are two different 
failure modes and the probability identity can be written as 

$$P(S) + P(F_1) + P(F_2) = 1.$$

Some examples for consideration of two failure modes are digital circuits, relays, 
switches, and diodes. In general, any reasonable number of states may be associated 
with the various modes which an item might assume. Some additional comments concerning 
mathematical descriptions of multiple failure modes appear in Sec. 4.3. 

It is desired to broaden one's concept of failure to include the many possible 
types which may occur. Some examples of failure modes are given below. 

(1) The performance of an item deviates from its nominal value by more 
than 10 percent. 

(2) A diode opens or shorts. 

(3) An amplifier is "noisy". 

(4) An accumulation of the effects of a somewhat periodic variation of the 
performance of an item outside given bounds. 

(5) Corrosion of a boiler tube. 

(6) Fracture of a pressure vessel. 

These various types of failure are introduced to motivate one to pay attention to 
possible ways in which items can fail and hence not overlook any important details. 

4.2 Reliability as Function of Time 

The probability density function of time to failure of an item will be used as 
the starting point, as this can be visualized easily from a histogram of time to 
failure data. In Fig. 4-1 a histogram is shown as dashed and the associated 
probability density is the continuous function. 

(1) The probability density of failure as a function of time t is 

$$p(t), \quad t \ge 0. \tag{4-1}$$

(2) The probability of failure of the item by time t is the cumulative 
probability 

$$F(t) = \int_0^t p(t)\,dt. \tag{4-2}$$

(3) Reliability is the probability of no failure by time t 

$$R(t) = 1 - F(t) = \int_t^\infty p(t)\,dt. \tag{4-3}$$

(4) The hazard rate is the conditional probability of failure per unit time 
at time t, given that the item has not failed by time t. Other terms widely 
used for hazard rate are failure rate (when the exponential failure 
density function applies), instantaneous failure rate, or force 
of mortality. 

The probability relationship concerning two dependent events can be used to develop 
the hazard rate. Recall that 

$$P(A \mid B) = \frac{P(AB)}{P(B)}.$$

(Basic probability definitions and relationships are presented in Appendix A.4.) 

If 

$P(A \mid B) = h(t)\,dt$ = probability that an item fails between t and t+dt, given 
that it has not failed by t, 

$P(AB) = p(t)\,dt$ = probability that an item has not failed by t and that it 
fails between t and t+dt, and 

$P(B) = R(t)$ = probability that an item has not failed by t. 

Hence 

$$P(A \mid B) = \frac{P(AB)}{P(B)} = \frac{p(t)\,dt}{R(t)},$$

$$h(t)\,dt = \frac{p(t)\,dt}{R(t)},$$

or 

$$h(t) = \frac{p(t)}{R(t)}. \tag{4-4}$$


The hazard rate function h(t) can also be obtained using the fact that it is an 
instantaneous failure rate: 

$$h(t) = \lim_{\Delta t \to 0} \frac{F(t+\Delta t) - F(t)}{\Delta t} \cdot \frac{1}{R(t)} = \frac{p(t)}{R(t)}.$$

It can also be expressed as follows: 

$$h(t) = \frac{-R'(t)}{R(t)} = -\frac{d \ln R(t)}{dt},$$

where $R'(t)$ is $dR/dt$. Reliability can now be expressed as 

$$R(t) = \exp\left\{-\int_0^t h(t)\,dt\right\}.$$


(5) The mean time to failure, MTTF, is the expected time to failure. 
The expected value of a continuous random variable x is 

$$E(x) = \int_{-\infty}^{\infty} x\, p(x)\,dx,$$

or in the above notation 

$$E(t) = \mathrm{MTTF} = \int_0^\infty t\, p(t)\,dt = \int_0^\infty R(t)\,dt. \tag{4-5}$$

The last result can be seen by integrating by parts the following: 

$$-\int_0^\infty t\, R'(t)\,dt, \qquad R'(t) = \frac{dR(t)}{dt}.$$
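
The relationships among Eqs. 4-1 through 4-5 can be checked numerically for any assumed density. The sketch below uses a Weibull density with hypothetical parameters (shape 2, scale 100 hours) and a plain trapezoidal rule; the closed-form Weibull reliability is used only as a cross-check and inside the hazard integrand to keep the computation short. None of the names or values come from the report.

```python
import math

SHAPE, SCALE = 2.0, 100.0  # hypothetical Weibull parameters (scale in hours)


def density(t):
    """Weibull probability density p(t), playing the role of Eq. 4-1."""
    z = t / SCALE
    return (SHAPE / SCALE) * z ** (SHAPE - 1.0) * math.exp(-(z ** SHAPE))


def reliability_exact(t):
    """Closed-form Weibull reliability, used only as a cross-check."""
    return math.exp(-((t / SCALE) ** SHAPE))


def integrate(f, a, b, steps=20000):
    """Composite trapezoidal rule on [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h


t = 50.0
F_t = integrate(density, 0.0, t)   # Eq. 4-2: cumulative probability of failure
R_t = 1.0 - F_t                    # Eq. 4-3: reliability
h_t = density(t) / R_t             # Eq. 4-4: hazard rate

# R(t) = exp(-integral of h) with h = p / R
cumulative_hazard = integrate(lambda u: density(u) / reliability_exact(u), 0.0, t)
R_from_hazard = math.exp(-cumulative_hazard)

# Eq. 4-5: both forms of the MTTF; the upper limit stands in for infinity.
UPPER = 1000.0
mttf_from_density = integrate(lambda u: u * density(u), 0.0, UPPER)
mttf_from_reliability = integrate(reliability_exact, 0.0, UPPER)

print(R_t, h_t, R_from_hazard, mttf_from_density, mttf_from_reliability)
```

For this assumed density the hazard rate is 2t/SCALE^2, so it increases linearly with time, and the two MTTF integrals agree with each other as Eq. 4-5 requires.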

The definitions in Eqs. 4-1 through 4-5 were developed for time as a 
continuous variable. In some situations it is appropriate to measure time as a 
discrete variable, where the number of cycles or operations to failure is a discrete 
variable. The definitions in Eqs. 4-1 through 4-5 have direct counterparts for handling 
discrete variables. These counterparts for the discrete variable case are shown below, 
where n is the number of cycles to failure [Ref. 13]. 


Probability density:      $p(n), \quad n = 1, 2, 3, \ldots$   (4-6) 

Probability of failure:   $F(n) = \sum_{i=1}^{n} p(i)$   (4-7) 

Reliability:              $R(n+1) = 1 - F(n)$   (4-8) 

Hazard rate:              $h(n) = \dfrac{p(n)}{R(n-1)}$   (4-9) 

Mean cycles to failure:   $\mathrm{MCTF} = \sum_{n=1}^{\infty} n\, p(n)$.   (4-10) 
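
As an illustration of the discrete case, the sketch below uses a geometric distribution of cycles to failure with a hypothetical per-cycle failure probability of 0.02. To keep the example independent of any indexing convention for R, the hazard is computed directly as the conditional probability of failing on cycle n given survival through the first n-1 cycles, p(n) / [1 - F(n-1)]; this way of writing it, like the function names, is a choice made for the sketch.

```python
def cycle_failure_probability(n, q):
    """p(n): probability that failure occurs on cycle n (n = 1, 2, ...),
    for a geometric distribution with per-cycle failure probability q."""
    return (1.0 - q) ** (n - 1) * q


def cumulative_failure(n, q):
    """F(n): probability of failure on or before cycle n."""
    return sum(cycle_failure_probability(i, q) for i in range(1, n + 1))


def hazard(n, q):
    """Probability of failing on cycle n given survival through cycle n - 1."""
    return cycle_failure_probability(n, q) / (1.0 - cumulative_failure(n - 1, q))


def mean_cycles_to_failure(q, terms=10000):
    """Truncated form of the sum of n * p(n)."""
    return sum(n * cycle_failure_probability(n, q) for n in range(1, terms + 1))


Q = 0.02  # hypothetical failure probability per cycle
print([round(hazard(n, Q), 6) for n in (1, 10, 25)])
print(round(mean_cycles_to_failure(Q), 6))
```

The hazard is the same on every cycle, which makes the geometric distribution the discrete analogue of the exponential distribution discussed next.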


A large number of possible probability density functions (discrete and 
continuous forms) have been proposed. Several are shown in Appendices A.1 and A.2. 
Although these density functions are presented with reference to lifetimes, there are 
also other possible applications of these same density functions in reliability 
analysis. Some of these density functions will again appear in subsequent sections 
of this volume. See Ref. 14 for some good examples of application of various 
density functions for reliability purposes. 

The exponential density function is widely used in reliability prediction and 
its key feature, a constant hazard rate, is illustrated below in Ex. 4-1. One of 
the most common misconceptions appearing in the reliability literature is the 
implication that a random failure law and the exponential failure law are one and 
the same. Assuming a random failure law simply implies that failure times occur 
randomly over time according to the stated probability distribution. There can be 
any number of such distributions or laws, depending upon whether the log-normal, the 
Weibull, the gamma, or some other distribution is assumed to best describe the 
distribution of failure times. 


Example 4-1 

A high-power magnetron has an exponential distribution of time to 
failure with a failure rate of $2.5 \times 10^{-3}$ failures per hour. The probability 
density function of Fig. 4-1 is that of this item. (a) What is the reliability 
of a new magnetron for the first 40 hours of operation? (b) What 
is the reliability for the following 40 hours of operation if the 
magnetron has not failed during the first 40 hours? 

Solution: 

(a) The probability density function (pdf) of the exponential distribution is 

$$p(t) = \lambda e^{-\lambda t}.$$

Using Eq. 4-3, the reliability equation for the exponential distribution is 

$$R(t) = \int_t^\infty \lambda e^{-\lambda t}\,dt = e^{-\lambda t}.$$

For $\lambda = 2.5 \times 10^{-3}$ per hour and t = 40 hours the reliability is 

$$R = e^{-2.5 \times 10^{-3} \times 40} = 0.905.$$


Figure 4-1 Exponential Probability Density 
with $\lambda = 2.5 \times 10^{-3}$ and a 400 Hour MTTF 

(b) Rephrasing the second question, what is the probability that failure will not 
occur in an interval $\Delta t = t'' - t'$, given that the item has not failed up to time $t'$? 
Using the probability that an item fails between t and t+dt if it has not failed 
by t, shown in the development of Eq. 4-4: 

$$R = 1 - \frac{\int_{t'}^{t''} p(t)\,dt}{R(t')} = 1 - \frac{R(t') - R(t'')}{R(t')} = \frac{R(t'')}{R(t')}.$$

In the example problem for the exponential distribution 

$$R = \frac{e^{-\lambda t''}}{e^{-\lambda t'}} = \frac{e^{-\lambda (t' + \Delta t)}}{e^{-\lambda t'}} = e^{-\lambda \Delta t} = e^{-2.5 \times 10^{-3} \times 40} = 0.905.$$


Thus the same solution applies to questions (a) and (b). Let us now apply Eq. 4-4 
for the hazard rate to the exponential distribution to assist in understanding this 
result. 

$$h(t) = \frac{p(t)}{R(t)} = \frac{\lambda e^{-\lambda t}}{e^{-\lambda t}} = \lambda.$$


The hazard rate for the exponential distribution is constant. For the exponential 
distribution the same reliability equation applies regardless of how much operating 
time has been accumulated. Only the exponential distribution is like this, which 
is one reason why it is widely used in reliability analysis. 
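
The arithmetic of Ex. 4-1 can be reproduced in a few lines. The failure rate and the 40 hour intervals are those of the example; the function names, and the Weibull comparison with shape 2 and scale 400 hours, are hypothetical additions included only to show how a non-exponential distribution behaves differently.

```python
import math

FAILURE_RATE = 2.5e-3  # failures per hour, from Ex. 4-1


def reliability(t, rate=FAILURE_RATE):
    """Exponential reliability R(t) = exp(-rate * t)."""
    return math.exp(-rate * t)


def conditional_reliability(t_start, dt, rel=reliability):
    """R(t'') / R(t') with t'' = t' + dt, for any reliability function `rel`."""
    return rel(t_start + dt) / rel(t_start)


part_a = reliability(40.0)                    # new item, first 40 hours
part_b = conditional_reliability(40.0, 40.0)  # next 40 hours, given survival
mttf = 1.0 / FAILURE_RATE                     # 400 hours, as in Fig. 4-1


def weibull_reliability(t, shape=2.0, scale=400.0):
    """Hypothetical wear-out item used only for comparison."""
    return math.exp(-((t / scale) ** shape))


weibull_new = weibull_reliability(40.0)
weibull_aged = conditional_reliability(40.0, 40.0, rel=weibull_reliability)

print(round(part_a, 3), round(part_b, 3), mttf)
print(round(weibull_new, 4), round(weibull_aged, 4))
```

For the exponential case parts (a) and (b) agree; for the assumed Weibull item the second 40 hours is less reliable than the first because its hazard rate increases with age.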

4.3 Bathtub Curve 

A form of the hazard rate which is widely cited in reliability literature is 
the bathtub curve as shown in Fig. 4-2 (a). A popular reasoning on how such a curve 
would come about is as follows. The early decreasing hazard rate is thought of as 
resulting from manufacturing defects, and early operation will remove these items 
from a population of like items. The remaining items have a constant hazard rate 
for some extended period of time where the failure cause is not readily apparent and 
finally those items remaining reach a wear-out stage. There is a strong parallel 
between the above curve and the instant mortality curve for human beings. 

None of the commonly used reliability distributions such as those cited in 
Sec. 4.2 and expanded on in Appendix A.l, e.g. log-normal or Weibull, individually 
has a form which has this bathtub shaped hazard function. Thus if a mathematical 
description of the bathtub curve is desired then it must be developed. One approach 
would be to first select an appropriate probability density for each of the three 
periods of decreasing, constant, and increasing hazard rates as shown in Fig. 4-2 (b). 
These will respectively be $p_d(t)$, $p_c(t)$, and $p_i(t)$. These could each be 
Weibull or gamma distributions with different shape and location parameters for each 
of the three periods. The $p_c(t)$ for the constant hazard rate will be the exponential 
distribution, which is one case of both the Weibull and gamma. Further, there is a 
probability that only one of the failure causes will occur for an item, where 
P(d), P(c), and P(i) are respectively these probabilities for each of the three causes 
and P(d) + P(c) + P(i) = 1. These probabilities for a single item will be the same 
as the percentages of a large population of failed items which would fail 
from each of the causes. A probability density for an item such as that shown in 
Fig. 4-2 (b) could be developed from 

$$p(t) = P(d)\, p_d(t) + P(c)\, p_c(t) + P(i)\, p_i(t), \tag{4-11}$$

where the terms are discussed above. The reliability function and hazard rate can 
then be developed using Eqs. 4-3 and 4-4. 
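
A minimal numerical sketch of Eq. 4-11 follows. It assumes three Weibull components (shape below 1, equal to 1, and above 1) with hypothetical cause probabilities and scale parameters; these values are chosen only to make the three regions visible and are not taken from the report.

```python
import math


def weibull_density(t, shape, scale):
    z = t / scale
    return (shape / scale) * z ** (shape - 1.0) * math.exp(-(z ** shape))


def weibull_reliability(t, shape, scale):
    return math.exp(-((t / scale) ** shape))


# (cause probability, Weibull shape, Weibull scale in hours); hypothetical values
COMPONENTS = [
    (0.1, 0.5, 10.0),    # P(d): early failures, decreasing hazard rate
    (0.3, 1.0, 1000.0),  # P(c): exponential, constant hazard rate
    (0.6, 5.0, 500.0),   # P(i): wear-out, increasing hazard rate
]


def mixture_density(t):
    """Eq. 4-11: weighted sum of the three component densities."""
    return sum(w * weibull_density(t, b, s) for w, b, s in COMPONENTS)


def mixture_reliability(t):
    """Eq. 4-3 applied to Eq. 4-11: weighted sum of component reliabilities."""
    return sum(w * weibull_reliability(t, b, s) for w, b, s in COMPONENTS)


def mixture_hazard(t):
    """Eq. 4-4 applied to the mixture."""
    return mixture_density(t) / mixture_reliability(t)


for hours in (1.0, 100.0, 500.0):
    print(hours, mixture_hazard(hours))
```

With these assumed values the hazard rate is high at 1 hour, much lower at 100 hours, and higher again at 500 hours. The exact shape depends strongly on the weights and parameters selected.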

Another approach to the development of a reliability function for the bathtub 
shaped hazard curve is to treat the reliability of each of the causes as conditional 
events. Here the probability that an item will not fail as a function of time is 

$$R(t) = R(\bar{d}; t)\, R(\bar{c}; t \mid \bar{d})\, R(\bar{i}; t \mid \bar{d}, \bar{c}), \tag{4-12}$$

where $\bar{d}$ is the event of no failure from the cause described with a decreasing hazard 
rate, and similarly for $\bar{c}$ and $\bar{i}$. Development of this function will lead to the same 
results as development of Eq. 4-11. 

There are two reasons for this discussion. One is that the development of 
Eqs. 4-11 and 4-12 illustrates how the same item would be mathematically described 
where time and multiple failure modes (or failure states of Sec. 4.1) are both 
explicitly considered. This approach will be used later in Part IV where detailed 
failure-modes are again treated. Another reason for development of mathematical 
models which would have a bathtub shaped hazard function is to assist in the under- 
standing of the implications of these curves. 

The above discussion is for a bathtub shaped curve for an item or for a population 
of identical items where a failed item is not repaired or replaced. The bathtub 
shaped curve is also used in association with the situation where an item is 
repaired or replaced, in particular for repairable equipment. When failed items are 
repaired or replaced, then the mathematical development of the bathtub curve is 
different from that noted above. This use is discussed in Sec. 4.8 following the 
presentation of some groundwork in Secs. 4.5 through 4.7 concerning reliability measures of 
repaired and replaced items. 

Figure 4-2 The Bathtub Shaped Hazard-Rate Curve 
and Its Probability Density: (a) hazard rate h(t); (b) probability density p(t) 


4.4 Consideration of the Environment 

The reliability of an item is defined as the probability that an item performs 
its intended function under defined conditions at a designated time for a specified 
operating period. Thus the reliability is conditional on a specific environment or 
environmental profile whether it is estimated by a simulated test or from results of 
items used in previous missions. The environment might be characterized by fixed 
conditions, such as temperature equal to 30°C, or it may be described by a deterministic 
profile, such as that shown in Fig. 4-3. 


Figure 4-3 Example of Deterministic Environmental Profile (T versus time) 

The environment might also be characterized by a random variable or a random 
process where time is explicitly considered. Some approaches to considering the 
effect of random environments on reliability measures are discussed below. 

If the environmental stress is described by its density function p(E), then 
the probability of successful operation is given by the following procedure. Let 
the conditional probability of success given E be denoted by 

$$P(S \mid E);$$

then the unconditional probability of success for a continuous density function p(E) 
is given by 

$$P(S) = \int_E P(S \mid E)\, p(E)\,dE \tag{4-14}$$

and for a discrete density function $P(E_i)$, i = 1, 2, ..., by 

$$P(S) = \sum_i P(S \mid E_i)\, P(E_i). \tag{4-15}$$


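Example 4-2 

Consider the simple example in which the probability density 
for the environment is discrete, taking the values $P(E_1)$, $P(E_2)$, and $P(E_3)$ 
for the environments $E_1$, $E_2$, and $E_3$, and let the probabilities of success 
conditional on these environments be $P(S \mid E_1)$, $P(S \mid E_2)$, and $P(S \mid E_3)$. 
[The numerical values of this example are illegible in the source.] 

Then the unconditional probability of success is 

$$P(S) = \sum_i P(S \mid E_i)\, P(E_i) = P(S \mid E_1) P(E_1) + P(S \mid E_2) P(E_2) + P(S \mid E_3) P(E_3).$$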

The above concept also can be used when an event requires an elapsed time period 
(such as S = no failure to time t) and also when the environments are time 
dependent. 
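
The discrete form, Eq. 4-15, is a weighted sum and is easily sketched in code. The environment probabilities and conditional success probabilities below are hypothetical values chosen for illustration; they are not the values of Ex. 4-2.

```python
def unconditional_success(env_probs, success_given_env):
    """Eq. 4-15: sum over environments of P(S | E_i) * P(E_i)."""
    if len(env_probs) != len(success_given_env):
        raise ValueError("one conditional probability is needed per environment")
    if abs(sum(env_probs) - 1.0) > 1e-9:
        raise ValueError("environment probabilities must sum to 1")
    return sum(pe * ps for pe, ps in zip(env_probs, success_given_env))


# Hypothetical case: three environments of increasing severity.
P_ENV = [0.25, 0.50, 0.25]
P_SUCCESS_GIVEN_ENV = [0.99, 0.95, 0.80]

p_success = unconditional_success(P_ENV, P_SUCCESS_GIVEN_ENV)
print(round(p_success, 4), round(1.0 - p_success, 4))
```

The unconditional probability of failure follows as the complement, 1 - P(S).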

In some situations it is necessary to explicitly consider the environment as 
a random process with known characteristics. Consider the problem where an item 
will sometimes fail when an environment which is a random process reaches a certain 
level. If the environment is a random process with peaks, the distance between which 
is given by the negative exponential distribution (assuming they occur with rate $\lambda$ 
per unit time period), and if the conditional probability of failure is p given that 
a peak has occurred, the probability that the item does not fail in the interval 
(0, t) is given by 

$$P(S) = P(\text{no peaks in } (0,t)) + P(\text{1 peak in } (0,t))\, q + P(\text{2 peaks in } (0,t))\, q^2 + \cdots$$

$$= e^{-\lambda t} + \frac{(\lambda t)\, e^{-\lambda t}}{1!}\, q + \frac{(\lambda t)^2 e^{-\lambda t}}{2!}\, q^2 + \cdots = e^{-\lambda p t},$$

where q = 1 - p is the probability that the item survives a given peak. 


Thus the failure time distribution is exponential under this environment. See 
Refs. 15 and 16 for further discussion on this and related descriptions of a random 
environment. Further, if an item will fail only after k peaks or shocks have occurred, 
the gamma density function is appropriate. That is, 

$$p_k(t) = \frac{\lambda^k t^{k-1} e^{-\lambda t}}{\Gamma(k)}, \quad t > 0, \tag{4-16}$$

$$p_k(t) = 0 \quad \text{elsewhere},$$

where 

t is time, 
$\lambda$ is the rate at which the shocks occur, 
k is the number of shocks for failure, and 
$\Gamma(k) = (k-1)! = (k-1)(k-2) \cdots 1$. 
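
Both results above can be checked numerically. The sketch below sums the series for the survival probability under random peaks and compares it with $e^{-\lambda p t}$, and then compares the integral of the gamma density of Eq. 4-16 with the probability that fewer than k shocks occur. The rate, failure probability, times, and k are hypothetical values, and the function names are chosen only for this illustration.

```python
import math


def poisson_probability(n, mean):
    """P(exactly n peaks) when the expected number of peaks is `mean`."""
    return math.exp(-mean) * mean ** n / math.factorial(n)


def survival_with_random_peaks(rate, p_fail, t, terms=60):
    """Truncated series: sum of P(n peaks in (0, t)) * q**n, with q = 1 - p_fail."""
    q = 1.0 - p_fail
    return sum(poisson_probability(n, rate * t) * q ** n for n in range(terms))


def gamma_density(t, k, rate):
    """Eq. 4-16 for an integer number of shocks k."""
    return rate ** k * t ** (k - 1) * math.exp(-rate * t) / math.factorial(k - 1)


def integrate(f, a, b, steps=20000):
    """Composite trapezoidal rule on [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h


RATE, P_FAIL, T = 0.01, 0.2, 100.0  # hypothetical values
series_value = survival_with_random_peaks(RATE, P_FAIL, T)
closed_form = math.exp(-RATE * P_FAIL * T)

K, T_K = 3, 200.0  # failure on the third shock; survival evaluated at 200 hours
prob_failed = integrate(lambda u: gamma_density(u, K, RATE), 0.0, T_K)
prob_fewer_than_k_shocks = sum(poisson_probability(n, RATE * T_K) for n in range(K))

print(series_value, closed_form)
print(1.0 - prob_failed, prob_fewer_than_k_shocks)
```

The second comparison uses the fact that an item which fails on the k-th shock survives to time t exactly when fewer than k shocks have occurred by then.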


In summary, the nature of the environment must be considered carefully to 
hypothesize models for behavior of the reliability function. 


4.5 Poisson Processes 

The Poisson process is widely assumed in reliability prediction, particularly 
for repairable items such as the typical electronic equipment. 

Let $P_n(t)$ = probability that exactly n occurrences are recorded during a 
time interval of length t. 

Thus $P_0(t)$ = probability of no occurrences, and 
$1 - P_0(t)$ = the probability of one or more occurrences. 

It is assumed that 

$$\lim_{t \to 0} \frac{1 - P_0(t)}{t} = \lambda,$$

that is, for short intervals the probability of one or more occurrences is proportional 
to the length of the interval, where $\lambda$ is a positive constant, the failure rate of 
an item. See Ref. 17 for a detailed development of 
this process and related birth and death processes. 

Postulates: Whatever the number of occurrences in the interval (0, t), the following 
conditional probabilities hold: 

$$P(\text{an occurrence in the interval } (t, t+h)) = \lambda h + o(h),$$

$$P(\text{more than one occurrence in } (t, t+h)) = o(h).$$


The above postulates yield the following difference equations:

$$ P_n(t+h) = P_n(t)\,(1 - \lambda h) + P_{n-1}(t)\,\lambda h + o(h), \qquad n \ge 1, \qquad (4\text{-}17) $$

i.e., the probability that there are n occurrences in the interval (0, t+h) is the
probability of n occurrences in the interval (0, t) multiplied by the probability
of no occurrences in the interval (t, t+h), plus the probability of n-1 occurrences
in the interval (0, t) and one occurrence in the interval (t, t+h), plus the probability
of n-x occurrences in (0, t) and x ($x \ge 2$) occurrences in the interval (t, t+h)
(the latter is of order o(h)). For n = 0,

$$ P_0(t+h) = P_0(t)\,(1 - \lambda h) + o(h), $$

$$ \frac{P_0(t+h) - P_0(t)}{h} = -\lambda P_0(t) + \frac{o(h)}{h}, $$

and as $h \to 0$ one obtains

$$ \frac{dP_0(t)}{dt} = -\lambda P_0(t) \quad \text{or} \quad P_0'(t) = -\lambda P_0(t). $$

Using $P_0(0) = 1$ we get $P_0(t) = e^{-\lambda t}$. Equation 4-17 similarly can be reduced to
the differential equation

$$ P_n'(t) = -\lambda P_n(t) + \lambda P_{n-1}(t), \qquad n \ge 1. \qquad (4\text{-}18) $$


Substituting $P_0(t) = e^{-\lambda t}$ into Eq. 4-18 with n = 1 and using $P_1(0) = 0$, we obtain

$$ P_1(t) = \lambda t\, e^{-\lambda t}. $$

We derive successively all the terms to obtain the general term

$$ P_n(t) = \frac{e^{-\lambda t}\,(\lambda t)^n}{n!}, \qquad n = 0, 1, 2, \ldots, \infty. \qquad (4\text{-}19) $$

Here o(h) denotes a function of h such that $\lim_{h \to 0} o(h)/h = 0$.

Equation 4-19 gives the probability that n occurrences will be observed in a time
interval (0, t) with a constant rate of occurrence per unit time equal to $\lambda$. The
quantity $\lambda t$ is the expected number of occurrences in the time interval of length t,
and one frequently sees the form

$$ P_n(t) = \frac{e^{-\mu}\,\mu^n}{n!}, \qquad (4\text{-}20) $$

where $\mu$ is the expected number of occurrences in the time interval of length t.
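The link between the difference equations and the closed form can be illustrated numerically. The sketch below steps Eq. 4-17 forward with a small h (dropping the o(h) terms) and compares the result with Eq. 4-19. The function names, step count, failure rate and interval length are assumptions chosen only for illustration.

```python
import math

def poisson_pmf(n, lam, t):
    """Closed form of Eq. 4-19."""
    mu = lam * t
    return math.exp(-mu) * mu**n / math.factorial(n)

def poisson_by_stepping(n_max, lam, t, steps=20_000):
    """Step Eq. 4-17 forward in time with small h, neglecting the o(h) terms."""
    h = t / steps
    probs = [1.0] + [0.0] * n_max              # P_0(0) = 1 and P_n(0) = 0 for n >= 1
    for _ in range(steps):
        new = [probs[0] * (1.0 - lam * h)]     # n = 0 equation
        for n in range(1, n_max + 1):
            new.append(probs[n] * (1.0 - lam * h) + probs[n - 1] * lam * h)
        probs = new
    return probs

# Assumed values: failures per hour, hours.
lam, t = 0.001, 1000.0
stepped = poisson_by_stepping(4, lam, t)
for n, value in enumerate(stepped):
    print(n, round(value, 5), round(poisson_pmf(n, lam, t), 5))
```

The agreement improves as the number of steps grows, which mirrors the limiting argument $h \to 0$ used in the derivation.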


Example 4-3

Suppose that an item has a failure rate, $\lambda$, of 0.001/hour. What
is the probability that no failures occur in 100 hours?

Solution:

$$ \lambda t = 100\,(0.001) = 0.1. $$

Thus

$$ P(0 \text{ failures}) = e^{-\lambda t} = e^{-0.1} \approx 0.905. $$

Note that this is equivalent to the reliability of the item. Hence one can better
understand the tie-in between the Poisson distribution and the exponential distribution.

Example 4-4

Suppose that a certain item is tested as follows. One item is
placed on test until failure and it is then replaced by another identical
item, etc. Suppose further that the failure rate of the item is 0.001/hour
and that the test is for 10,000 hours. What is the probability of
at least 15 failures?

Solution:

$\lambda t$ is equal to $0.001\,(10^4) = 10$, the expected number of failures in $10^4$ hours.
Thus the probability of at least 15 failures is given by

$$ P(n \ge 15) = \sum_{n=15}^{\infty} \frac{e^{-\lambda t}\,(\lambda t)^n}{n!} = \sum_{n=15}^{\infty} \frac{e^{-10}\,10^n}{n!} = 0.083, $$

using Molina's Tables [Ref. 18] for the Poisson distribution. The same solution
would apply to the above problem if a single item were repaired and returned to
operation. Here the operating times between failure would be exponentially distributed,
and the equipment reliability index, Mean Time Between Failure (MTBF), is
the mean of this distribution. Only the operating time would be considered. Further,
the same solution would apply to any number of identical items operated for a total
time of 10,000 hours, regardless of how much time was accumulated on any item.

A reason why the Poisson process is widely assumed in reliability prediction
is that in this mathematical model past operation has no influence on future reliability.
This simplifies a prediction analysis; for some complex systems it makes
the prediction practical.
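The two Poisson examples above can be reproduced without tables. The following sketch uses only the standard library; the helper names are hypothetical, and the inputs are the failure rates and durations stated in Examples 4-3 and 4-4.

```python
import math

def poisson_pmf(n, mu):
    """Probability of exactly n occurrences when mu occurrences are expected."""
    return math.exp(-mu) * mu**n / math.factorial(n)

def prob_at_least(n_min, mu):
    """Probability of n_min or more occurrences (complement of the lower sum)."""
    return 1.0 - sum(poisson_pmf(n, mu) for n in range(n_min))

# Example 4-3: 0.001 failures/hour for 100 hours.
p_no_failure = poisson_pmf(0, 0.001 * 100)
# Example 4-4: 0.001 failures/hour for 10,000 hours, at least 15 failures.
p_15_or_more = prob_at_least(15, 0.001 * 10_000)
print(round(p_no_failure, 4))
print(round(p_15_or_more, 3))
```

Summing the lower tail and taking the complement avoids truncating the infinite sum in the tail probability.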

4.6 Reliability Measures for Repaired Items 

Reliability descriptions for repairable items, such as a motor or typical
electronic equipment, are discussed here for a general situation. With repair,
there are time (of operation) to first failure,
time between first and second failure, time between second and third failure, and
so on. Each of these failure times when considered for a large number of identical
items will have a distribution associated with it; these distributions may or may not
be identical.

The data from motor failures [Ref. 19] have indicated time between failure 
patterns as in Fig. 4-4. Density functions of the time to first failure, time 
between first and second failure, time between second and third failure, and so on 
are shown in Fig. 4-4, and these could be fitted with Weibull distributions with 
different shape and scale parameters. The origin of each density function is when 
operation is resumed after the motor is repaired. When the density functions are 
plotted on an elapsed operating time scale, starting with the earliest initial 
operation, then only the time to first failure density function is as shown in 
Fig. 4-4 and the others have a different shape. This is illustrated in Fig. 4-5. 

The density function of the time to second failure on the scale in Fig. 4-5 is that
of the sum of the time to first failure and the time between first and second failure;
that of the time to third failure is for the sum of the first, second and third times,
and so on. There is considerable overlap on the time scale of Fig. 4-5; as an example
the early overlap of the first and second times comes about because some of the first
failures occur late and are repaired, and the second failure then occurs shortly
afterward. The overlap reflects the many possible combinations by which the first and
second failures can occur. As the third, fourth, and additional failures are brought
into consideration, they enter into the overlap on the elapsed operating time scale in
a similar manner. The summing operation is referred to as convolution. Fig. 4-5 also
illustrates the renewal rate, which represents the total number of items failing per
unit of time, divided by the original population. It can be seen to be the sum of the
ordinates of all the density functions of the time to failure as a continuous function
of time. Note that this is a conventional deterministic, algebraic summing, and is thus
different from the probabilistic convolution type summing noted above. The renewal
rate where an item is repaired is similar in one sense to the density function of the
non-repairable item, as their shapes are what the smoothed curves for histograms
of populations of items would look like. However, they are different in the sense
of predicting the reliability for a single item. If for a repairable item the
accumulated operating time since the last failure and the shape of the density
function of the pending failure are known, then a reliability prediction equation
would be based on Eq. 4-4. If this information is not known, then how a reliability
prediction would be made would depend on just how much is known concerning accumulated
operating time, accumulated failures, and the time of last failure.

Figure 4-5 Renewal Rate Associated With Fig. 4-4

The mathematical description of the renewal rate is sketched below. Extensive
treatment of it can be found in Ref. 20.

$$
\begin{aligned}
u_1(t) &= p_1(t) \\
u_2(t) &= \int_0^t u_1(t_1)\, p_2(t - t_1)\, dt_1 \\
       &\;\;\vdots \\
u_i(t) &= \int_0^t u_{i-1}(t_{i-1})\, p_i(t - t_{i-1})\, dt_{i-1} \\
       &\;\;\vdots \\
u_n(t) &= \int_0^t u_{n-1}(t_{n-1})\, p_n(t - t_{n-1})\, dt_{n-1}. \qquad (4\text{-}21)
\end{aligned}
$$

The renewal rate is their sum:

$$ r(t) = \sum_{i=1}^{n} u_i(t). \qquad (4\text{-}22) $$

Here

$p_i(t)$ = the density function of the time between the
(i-1)th and ith failure, where elapsed time
includes only that of the ith failure,

$u_i(t)$ = the density function of the time to the ith
failure, where elapsed time includes that
of previous failures, and

r(t) = the renewal rate, where elapsed time is
continuous.


The renewal rate has not received much explicit application to conventional reliability
predictions, as conventional predictions typically assume that times between
all failures during the period of interest have an exponential distribution and
thus are a Poisson process as discussed in Sec. 4.5. The Poisson process was
developed in Sec. 4.5 using difference equations, but it could also be developed
using the renewal rate as the basis. However, for various mixtures of non-exponential
distributions where the difference equation approach is not applicable, the renewal
rate offers an approach for developing appropriate mathematical models. The discussion
of renewal rates is included here to give those interested in using non-exponential
distributions for repairable items an indication of how to get started and also to
support the later discussion in Sec. 4.8 concerned with the use of bathtub shaped
curves for repairable items.
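For readers who want to get started numerically, the following sketch evaluates the convolutions of Eq. 4-21 on a uniform time grid with the trapezoidal rule and sums them as in Eq. 4-22. It is a minimal illustration, not a production method: the grid spacing, the number of terms, and the choice of identical exponential densities (for which the renewal rate should come out constant and equal to $\lambda$) are all assumptions.

```python
import math

def convolve(u_prev, p_next, dt):
    """Trapezoidal approximation of u_i(t) = integral_0^t u_{i-1}(s) p_i(t - s) ds."""
    n = len(u_prev)
    out = [0.0] * n
    for j in range(1, n):
        total = 0.5 * (u_prev[0] * p_next[j] + u_prev[j] * p_next[0])
        for m in range(1, j):
            total += u_prev[m] * p_next[j - m]
        out[j] = total * dt
    return out

def renewal_rate(densities, dt):
    """Sum the time-to-ith-failure densities u_i(t) to obtain r(t) (Eq. 4-22)."""
    u = list(densities[0])              # u_1(t) = p_1(t)
    r = list(u)
    for p_i in densities[1:]:
        u = convolve(u, p_i, dt)        # Eq. 4-21
        r = [a + b for a, b in zip(r, u)]
    return r

# Assumed grid and failure rate, for illustration only.
dt, n_points, lam = 0.02, 201, 1.0
grid = [k * dt for k in range(n_points)]
exp_density = [lam * math.exp(-lam * t) for t in grid]
r = renewal_rate([exp_density] * 30, dt)    # 30 identical exponential densities
print(round(r[50], 3), round(r[200], 3))
```

Passing a list of different densities (for example Weibull shapes evaluated on the same grid) gives the renewal rate for the non-identical case; enough terms must be included that later failures contribute negligibly over the time span of interest.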

4.7 Reliability Measures for Replaced Items 

A somewhat similar situation to the repairable item exists where identical,
non-repairable items are used in large quantities and are replaced with new items
when a failure occurs. Examples of this are light bulbs or fluorescent tubes in
large buildings. Here the mathematical description of the density functions of the
time to first failure, time between second and third failure, and so on are identical.

The renewal rate of Sec. 4.6 becomes the replacement rate, where the latter has
the feature that all densities of time between failure are identical. Where this
feature exists, then r(t) of Eq. 4-22 and Fig. 4-5 becomes constant and equal to the
reciprocal of the mean lifetime after several generations [Ref. 20]. This is a
classical problem in renewal theory, but has limited applicability for real world
reliability analysis problems.
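The limiting behavior of the replacement rate can be illustrated by simulation. The sketch below follows a number of sockets, each refilled with a new item at every failure, and counts failures in a late time window. The gamma lifetime distribution (shape 2, scale 1, hence mean 2), the window, the number of sockets and the random seed are all assumptions made for the illustration.

```python
import random

def replacement_rate(draw_lifetime, start, stop, n_sockets, rng):
    """Estimate failures per socket per unit time within the window [start, stop)."""
    count = 0
    for _ in range(n_sockets):
        clock = draw_lifetime(rng)          # time of the first failure
        while clock < stop:
            if clock >= start:
                count += 1
            clock += draw_lifetime(rng)     # failed item replaced by a new one
    return count / (n_sockets * (stop - start))

def gamma_lifetime(rng):
    # Assumed non-exponential lifetime: gamma with shape 2 and scale 1 (mean 2).
    return rng.gammavariate(2.0, 1.0)

rng = random.Random(0)
rate = replacement_rate(gamma_lifetime, start=20.0, stop=30.0, n_sockets=2000, rng=rng)
print(rate, 1.0 / 2.0)
```

The estimate should lie close to the reciprocal of the mean lifetime even though the lifetimes are not exponential, which is the point made in the text.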

4.8 Bathtub Curve for Repaired Items 

The familiar bathtub shaped curve, which has previously been discussed in
Sec. 4.3 for a single lifetime hazard rate where there was no repair or replacement,
is also used on occasion for repairable equipment. Typically there is no discussion
of the appropriate mathematical development [Ref. 21, p. 24]. Such a bathtub
shaped curve for the repaired item, however, implies a different mathematical model
than for the non-repairable item. (The repairable equipment is confused with the
non-repairable part on page 19 of Ref. 22.) A mathematical model which would lead to
a bathtub shaped curve for repaired items could result from application of the renewal
rate of Sec. 4.6, which is quite different from that of Sec. 4.3 based on the hazard
rate.

In Fig. 4-6 some time between failure density functions are shown for the time 
to the first, time between first and second, and so on. Figure 4-7 shows 
the renewal rate as well as the elapsed operating times to the first, first plus 
second, first plus second plus third, and so on. Figures 4-6 and 4-7 do 
not come directly from data, but are a judgement assumption which is believed to be 
somewhat similar to that which would be found for some electronic equipment. 





Each of the time between failure density functions of Fig. 4-6 is assumed to be
exponential in shape, but with some differences in the mean time to failure parameters.
The first two density functions have successively increasing means, the third density
function on through a very large number, n, have the same mean, and the (n+1)st and
successive density functions have decreasing means. Combining these time between
failure density functions into a renewal rate is illustrated in Fig. 4-7, which has
the familiar bathtub shape.

Figure 4-6 Exponential Time Between Failure Densities with Different Means

Figure 4-7 The Bathtub-shaped Renewal-rate Curve for Repaired Items of Fig. 4-6


The flat portion of the renewal rate of Fig. 4-7 is the situation often assumed
in reliability prediction. Here the accumulated operating time does not affect the
reliability of an equipment, and the reliability model of

$$ R(t) = e^{-\lambda t} $$

applies regardless of age. This period is also described by the Poisson process
discussed in Sec. 4.5. Exponential distributions with identical $\lambda$ will always result
in a constant renewal rate. On the other hand, a constant renewal rate does not mean
that the times between successive failures have an exponential distribution and that
a Poisson process exists. Recall that Sec. 4.7 noted that a constant renewal rate
will ultimately result from any stable distribution of time between failures.

One reason for going into this discussion of the widely cited bathtub curve
is to point out that a bathtub curve could arise from other than identical exponential
distributions. As reliability analysis matures and is extended to a wider
diversity of commodities it will be increasingly necessary to be aware of the
possibility that non-exponential distributions might exist. For instance, data from
a population of repaired equipment when plotted in histogram fashion might resemble
the bathtub curve, but the distribution of time between failures need not be
exponential. Correct choice of underlying distributions can have significant implications
for the accuracy of reliability predictions, for the validity of statistical tests,
and for the optimization of preventive maintenance actions based on assumed
distributions.

5. Bound-Crossing

The type of reliability measures treated in this section are those sometimes
labeled tolerance, drift or degradation, better described as a "bound-crossing" type.
Items are designed to meet given requirements; for example, the output voltage of an
electronic power supply shall be 115 ± 1 volts ac, and the item is designated as failed if
the voltage crosses one of the bounds, 114 and 116 volts ac. In a mechanical system
it may be desired to estimate the probability that the strength of an item will
exceed the stress to which it is subjected. In some environments the strength of an
item will be a function of time as a result of fatigue due to thermal cycling or
stress cycling. In this case we will be interested in the probability that at the
mission end the item will have sufficient strength to meet the applied stress. The
bound in this case is not necessarily a fixed level but may be a distribution of
stress levels.

5.1 Fundamentals

5.1.1 Notation

The notation to be used in this section will be y for a performance characteristic,
s for stress level or environment level, and t for time. The bound will be
denoted by $\ell$ for lower and u for upper.

5.1.2 Bound-Crossing Reliability 

The probability that a performance characteristic y does not exceed the upper
bound u is denoted by

$$ P_u = P(y \le u), $$

and the probability that u is exceeded by y is

$$ 1 - P_u = P(y > u). $$

Similarly, $P_\ell$ is the probability that y is less than the lower bound $\ell$, i.e.
$P_\ell = P(y \le \ell)$. If the bound has a distribution of values, such as the probability
density p(s) for stresses, then the probability that y-s exceeds 0,

$$ R = P(y - s > 0), $$

is a measure or index of the performance of the item. To consider more than one
performance characteristic and stress, vector notation can be substituted for the single
values y and s respectively, with due consideration for probabilistic dependence. To
consider time the appropriate distributions become time varying, and additional criteria
of failure become possible; this will be discussed in Sec. 5.4.

5.1.3 Distribution Types

In order to estimate the probabilities $P_u$ and $P_\ell$ it is necessary to know the
distributions of the performance characteristics and stress levels. These distributions
may be any one of several common distribution forms given in Appendix A.1,
e.g. normal, log-normal, uniform, gamma, etc. The selection of the distribution
form can sometimes be made on the basis of technical considerations, such as that positive
and negative deviations of the same magnitude are equally likely (normal), or that
the incremental changes are proportional to the measurement value (log-normal).
Refer to Ref. 15 for basic assumptions underlying some distribution forms. Often
the distributions are selected on a subjective basis to describe one's feeling about
the variation of the characteristics and perhaps more often they are chosen for
convenience of the analytical methods. The latter is often not necessary due to the
capabilities of modern electronic computers and the availability of "canned" computer
programs to perform the necessary analyses as described in Volume II - Computation.
Time varying distributions for bound-crossing problems introduce additional considerations
which will be covered in Sec. 5.4.

5.2 Fixed Bounds 

In this situation it is assumed that a distribution form can be selected which
describes the variation in the performance attribute at some specified time in its
life when used under certain environmental conditions. The distribution can sometimes
be selected by basic considerations of the physical process, by fitting a few
distributions by graphical techniques, or by using more sophisticated statistical
techniques for estimating the distribution parameters. In some cases the form of
the distribution may not be specified and a distribution free or non-parametric
method is used. The bounds are assumed to be known from technical considerations of the
application of the item in the system.

Example 5-1

A performance attribute of interest at the end of 10,000 hours
under specified environmental conditions is normally distributed with
mean 95 and standard deviation 10 units. For example, this might
apply to $h_{fe}$, an equivalent circuit h-parameter for a transistor. If
it is desired that $h_{fe}$ exceed 70, then what is the probability that a
transistor selected at random from a population of similar items and
subjected to 10,000 hours of operation under the same conditions will
have an $h_{fe}$ greater than 70?

Figure 5-1 Probability Density Function for $h_{fe}$

The probability that $h_{fe}$ exceeds 70 is

$$
\begin{aligned}
P(h_{fe} > 70) &= 1 - P(h_{fe} \le 70) \\
               &= 1 - P\!\left(\frac{h_{fe} - \mu}{\sigma} \le \frac{70 - \mu}{\sigma}\right) \\
               &= 1 - \Phi\!\left(\frac{70 - \mu}{\sigma}\right),
\end{aligned}
$$

where $\mu$ = mean of $h_{fe}$ = 95,

$\sigma$ = standard deviation of $h_{fe}$ = 10,

and $\Phi(X)$ is the area under the standard normal distribution curve to the left of X.

In this example X is -2.5 and the area to the left of -2.5 can be obtained from a
table of areas under a normal curve such as given in standard probability and statistics
books:

$$ \Phi(-2.5) = 0.0062. $$

Thus the probability that $h_{fe}$ exceeds 70 is

$$ P(h_{fe} > 70) = 1 - 0.0062 = 0.9938. $$

Figure 5-1 illustrates this example.

The assumptions under which the above result was obtained are given below and should
be carefully noted when using these techniques:

(1) Normal distribution of values of $h_{fe}$,

(2) Known mean and standard deviation, and

(3) Conditions of manufacture and operation of items are same as
those to which the probability estimate is to apply.

Checking the first and second assumption would depend on the source of the information
for the $h_{fe}$ distribution. Often this will come from special tests for this
purpose. If so, the first assumption above can be checked graphically by plotting
the sample distribution function on normal probability paper. The extent to which
one can check the adequacy of the normality in the region of the tails depends upon
the amount of data. The second assumption is really never satisfied but for very
large sample sizes the results would be practically unaltered by using procedures
which depend on the sample mean $\bar{x}$ and standard deviation s. The third assumption is
of special importance to the design engineer in that he will specify the test conditions
to correspond as nearly as possible to those conditions under which he wishes
to infer the quality concerning the items tested.

Similar results can be obtained using other forms of distributions such as
log-normal, Weibull, extreme-value, etc. In each case the "goodness" of the distribution
can be checked by a probability graph of appropriate form and/or by analytical
techniques such as given in statistical texts, e.g. see Ref. 23.
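The table lookup in Example 5-1 can also be done with the error function available in most programming languages. The sketch below is one way to do this in Python; the helper names are hypothetical, and the mean, standard deviation and lower bound are those of the example.

```python
import math

def standard_normal_cdf(x):
    """Area under the standard normal curve to the left of x."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def prob_exceeds(lower_bound, mean, std_dev):
    """P(y > lower_bound) for a normally distributed attribute y."""
    return 1.0 - standard_normal_cdf((lower_bound - mean) / std_dev)

p = prob_exceeds(70.0, 95.0, 10.0)
print(round(p, 4))
```

The same two functions apply to any fixed-bound problem with a normal attribute; other distribution forms would need their own distribution function in place of `standard_normal_cdf`.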

5.3 Stress-Strength Model (Bound Distribution)

In this case the performance of an item is considered to be satisfactory if the 
strength of the item exceeds the stress to which it is to be exposed in application. 
Thus the bounds may not be fixed in that an item selected at random and used in a 
specific system may be subjected to one of a distribution of stresses rather than 
a known fixed stress. In actual practice the stress may vary over the life of the 
item but consider for the moment that an item is subjected to a constant stress over 
its life and that the stress level may vary from item to item. 

The approach to this problem is to specify the stress and the strength density
functions by one of the methods of Appendix A.1. Then the parameters of the distributions
are estimated and one then computes the estimate of the desired probability.
Thus in the notation suggested earlier, the strength density function is denoted by p(y) and
the stress density function by p(s). Then it is desired to determine the probability that y is larger
than s,

$$ P(y > s), $$

or the equivalent

$$ P(y - s > 0). $$

This is found by

$$ P(y > s) = \int_0^{\infty} p(s) \left[ \int_s^{\infty} p(y)\, dy \right] ds, \qquad (5\text{-}1) $$

where the range of s and of y does not contain negative values and the distributions
are independent. An example is given below in which both distributions are assumed
to be independent and normal. In this case the difference y-s is also normally
distributed and the parameters for this distribution of y-s are given in terms of
those for the individual distributions of y and s respectively.
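When the densities are not both normal, Eq. 5-1 generally has to be evaluated numerically. The sketch below shows one simple way: the inner integral is supplied as the strength survival function and the outer integral is done by the trapezoidal rule. The normal densities used here (strength mean 25, standard deviation 3; stress mean 15, standard deviation 2, in thousands of psi), the integration limit and the step count are illustrative assumptions; the function names are hypothetical.

```python
import math

def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def normal_sf(x, mean, sd):
    """P(value > x): the inner integral of Eq. 5-1 for a normal strength."""
    return 0.5 * math.erfc((x - mean) / (sd * math.sqrt(2.0)))

def prob_strength_exceeds_stress(stress_pdf, strength_sf, upper, steps=4000):
    """Trapezoidal evaluation of the outer integral of Eq. 5-1 over (0, upper)."""
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        s = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * stress_pdf(s) * strength_sf(s)
    return total * h

p = prob_strength_exceeds_stress(
    lambda s: normal_pdf(s, 15.0, 2.0),     # assumed stress density
    lambda s: normal_sf(s, 25.0, 3.0),      # assumed strength survival function
    upper=60.0,
)
print(round(p, 4))
```

Because both assumed distributions are normal, the result can be compared with the closed form for the difference y-s; for other distributions only the two function arguments need to change.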


Example 5-2

Consider a simple stress-strength analysis of a part with strength
density function assumed to be normal with mean ($\mu$) 25K psi and standard
deviation ($\sigma$) 3K psi and stress distribution with mean 15K psi and standard
deviation 2K psi. The two density functions are illustrated in Fig. 5-2.

Figure 5-2 Probability Density Functions for Stress and Strength

The probability that the strength exceeds the stress is given by the probability

$$ P(y - s > 0), $$

where y is the strength and s is the stress. Now y-s is also normally distributed
with mean 10K and standard deviation

$$ \sigma(y-s) = \sqrt{\sigma^2(y) + \sigma^2(s)} = K\sqrt{9 + 4} \approx 3.6K \text{ psi}, $$

and hence

$$ P(y - s > 0) = P\!\left(\frac{(y-s) - 10K}{3.6K} > \frac{-10K}{3.6K}\right) = P\!\left(u > \frac{-10K}{3.6K}\right), $$

where u is a standard normal variable, and thus

$$ P(y - s > 0) = 1 - \Phi(-2.78) = 0.9973. $$

One of the major difficulties in stress-strength problems is obtaining sufficiently
good estimates of the stress and strength distributions and hence of the
difference y-s. Given these estimates the problem of estimating the probability is
a difficult one even if one assumes a normal distribution. Often the difficulty is
eased by using the estimated safety margin as a measure or index of adequate
strength-stress margin. The safety margin is

$$ \text{Safety Margin} = \frac{\mu(y) - \mu(s)}{\sqrt{\sigma^2(y) + \sigma^2(s)}}. \qquad (5\text{-}2) $$
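The safety margin of Eq. 5-2 and the corresponding probability under the normal assumption are easily computed. The sketch below applies them to the numbers of Example 5-2 (in thousands of psi); the function names are hypothetical.

```python
import math

def safety_margin(mean_strength, sd_strength, mean_stress, sd_stress):
    """Eq. 5-2: difference of means divided by the standard deviation of y - s."""
    return (mean_strength - mean_stress) / math.hypot(sd_strength, sd_stress)

def reliability_from_margin(margin):
    """P(y - s > 0) when y - s is normal, i.e. the standard normal CDF at the margin."""
    return 0.5 * (1.0 + math.erf(margin / math.sqrt(2.0)))

margin = safety_margin(25.0, 3.0, 15.0, 2.0)
reliability = reliability_from_margin(margin)
print(round(margin, 2), round(reliability, 4))
```

Without rounding the standard deviation to 3.6K the margin is about 2.77 rather than 2.78, so the computed probability differs from the tabulated 0.9973 only in the fourth decimal place.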

5.4 Time Dependency

The random behavior over time of a performance attribute can be visualized as
a time-varying probability density function as illustrated in Fig. 5-3. Such sketches
are sometimes used for data for a part characteristic obtained from life testing.
Where the criterion of failure is that the performance attribute y(t) go outside some
fixed bound, then the reliability measure is a straightforward extension of the approach
in Sec. 5.2 if the performance attribute drift is always either increasing or decreasing
(monotonic) such as shown in Fig. 5-4. (Lower case letters are used in this report
for random processes and variables.)

Figure 5-3 Drift of y(t) Illustrated as a Time-varying Density Function

Figure 5-4 Examples of Monotonic Drift Behavior

Here the reliability is

$$ R(t) = \mathrm{Prob}[\,y_\ell < y(t) < y_u\,], $$

which is also

$$ R(t) = \int_{y_\ell}^{y_u} p(y; t)\, dy, $$

where the integration is over y at time t and R(t) is a monotonically decreasing
function. An approximation to R(t) by performing this integration on the p(y; t)
at selected times t will often suffice. It is stressed that this approach is for a
monotonic drift. Non-monotonic drift such as shown in Fig. 5-5 introduces additional
considerations.

For non-monotonic drift first consider the failure criterion treated above where
failure is defined as the performance attribute going outside some fixed bound. If
all that is known is some p(y; t), then the drift reliability cannot be obtained.
For instance, the p(y; t) at time $t_1$ and at some later time $t_2$ might be identical,
but this does not mean that no additional failures have occurred because in a
population of items some which were out of bound may have drifted back in and others
may have drifted out. Therefore it is necessary to describe the time-varying
distributions of the performance attributes with a functional form. Here the
performance attribute is expressed as a deterministic function $y(t) = y(\alpha; t)$ where
the $\alpha$ are probabilistic with known probability density $p(\alpha)$. This method can be
used where the drift failure criterion is a first crossing of a bound for either
monotonic or non-monotonic drift such as in the above discussions, and it also can be
used for other criteria for non-monotonic drift such as the following:

(1) the cumulated area outside a bound(s),

(2) the number of crossings of a bound(s),

(3) the cumulated time outside a bound(s).

The approach for non-monotonic drift is to reduce the failure criterion to a first
crossing problem. A new function w(t) is defined such that reliability becomes

$$ R(t) = P(y_\ell < w(t) < y_u). $$

As an example, Fig. 5-5 illustrates the last failure criterion, the amount of time
that y(t) is outside the bounds. Here a corresponding function w(t) is defined, and
the failure criterion becomes w(t) first crossing a specified level. Other w(t)
functions could be established for other failure criteria.

An example of a possible mathematical form for describing the performance
attribute is the polynomial expression $y(t) = b_0 + b_1 t + \cdots + b_n t^n$ where the b's
are random variables of the same sign for monotonic drift and of mixed signs for
non-monotonic drift. Trigonometric series offer forms for periodically varying
attributes.
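The reduction to a first-crossing problem can be illustrated by simulation. In the sketch below the performance attribute is a quadratic polynomial with random coefficients of mixed signs, w(t) is the cumulated time that y(t) has spent outside the bounds, and an item is counted as failed once w(t) reaches a specified level. Every numerical value (coefficient distributions, bounds, level, grid, seed) is a hypothetical choice made only to exercise the idea.

```python
import random

def cumulated_time_outside(coeffs, lower, upper, t_end, dt):
    """w(t) on a grid: running total of time y(t) = sum(b_k * t**k) is outside (lower, upper)."""
    w, path = 0.0, []
    steps = int(round(t_end / dt))
    for i in range(1, steps + 1):
        t = i * dt
        y = sum(b * t**k for k, b in enumerate(coeffs))
        if not (lower < y < upper):
            w += dt
        path.append(w)
    return path

def reliability_curve(n_items, w_limit, lower, upper, t_end, dt, rng):
    """Fraction of simulated items whose w(t) has not yet reached w_limit."""
    steps = int(round(t_end / dt))
    surviving = [0] * steps
    for _ in range(n_items):
        # Assumed coefficient distributions (hypothetical values, mixed signs).
        coeffs = [rng.gauss(100.0, 1.0), rng.gauss(0.5, 0.2), rng.gauss(-0.02, 0.005)]
        path = cumulated_time_outside(coeffs, lower, upper, t_end, dt)
        for i, w in enumerate(path):
            if w < w_limit:
                surviving[i] += 1
    return [count / n_items for count in surviving]

rng = random.Random(1)
curve = reliability_curve(500, w_limit=5.0, lower=97.0, upper=104.0,
                          t_end=60.0, dt=0.5, rng=rng)
print(curve[0], curve[59], curve[-1])
```

Because w(t) can only increase, the estimated reliability is non-increasing in time even though y(t) itself may leave and re-enter the bounds.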









In some situations where the drift of the performance attribute is non- 
monotonic it may be represented by a stochastic process. Such a situation could be 
the error in a system output. For example, a stationary Markov Gaussian noise 
process may be completely described by its auto-correlation function or its power 
spectral density. An experimental application has been made using this general 
approach for the error in a tilt stabilization assembly for an airborne radar antenna 
[Ref. 24]. A recent theoretical book [Ref. 25] discusses various reliability 
indices for stochastic processes. 

The discussion in this section is about fixed bounds. The reader who is
interested in time-dependent problems where the bound is a distribution (such as
in Sec. 5.3 for the stress-strength problem) would find some guidance in the
discussion of Part IV. The basic idea would be to treat both the performance
attribute and the bound as independent variables in a deterministic function. The
dependent variable then becomes a single performance attribute which has a fixed
bound for the failure criterion. For example, let w(t) = y(t) - s(t) where y is
the strength, s is the stress, and w is the new performance attribute which has
the bound w(t) > 0.

Two examples follow where the performance attribute is a time-varying normal
distribution. In these examples the performance attributes are of the form
$y(t) = y(\alpha; t)$ where the $\alpha$ are normally distributed.

Example 5-3

Suppose that the resistance of a particular electrical resistor
changes over the interval $0 < t \le T$ according to

$$ r(t) = \alpha_1 + \alpha_2 t \ \text{ohms}, $$

where

$\alpha_1$ is normal with mean 330 Ω and standard deviation 33 Ω,
$\alpha_2$ is normal with mean -0.003 Ω/hour and standard deviation
0.001 Ω/hour.

Let r(t), the resistance at time t, be the performance measure of interest.
Then r(t) is also normally distributed, with mean and standard deviation
(the two coefficients being treated as independent)

$$ \mu\{r(t)\} = (330 - 0.003t)\ \Omega, $$

$$ \sigma\{r(t)\} = \{(33)^2 + t^2 (0.001)^2\}^{1/2}\ \Omega = \{1089 + 1 \times 10^{-6}\, t^2\}^{1/2}\ \Omega. $$

For t = 1000 hrs.,

$$ \mu\{r(1000)\} = 327\ \Omega, \qquad \sigma\{r(1000)\} = (1090)^{1/2} = 33.0\ \Omega, $$

and the density function of resistances at 1000 hrs. is shown
in Fig. 5-6.

Figure 5-6 Probability Density Function of Resistance at t = 1000 Hours

If the reliability were defined as the probability that the resistance
lies between 270 and 390 Ω, then it would be given by the following, at
t = 1000 hours,

$$ R = \Phi\!\left(\frac{390 - 327}{33}\right) - \Phi\!\left(\frac{270 - 327}{33}\right) = \Phi(1.91) - \Phi(-1.72) = 0.97 - 0.04 = 0.93, $$

where $\Phi(x)$ is the cumulative standard normal distribution function
obtained from standard texts.

Example 5-4

Suppose that a mechanical part under consideration has a strength
which decreases with time in accordance with some function of time under
stress. Let the strength be described by

$$ y(t) = \alpha_1 e^{-kt} + \alpha_2 (1 - e^{-kt}), $$

where $\alpha_1$ is the initial strength, $\alpha_2$ the strength as $t \to \infty$, and k is a
constant determined by the particular part. Let also

$\alpha_1$ be normally distributed with mean 50K psi and
standard deviation 4K psi,

$\alpha_2$ be normally distributed with mean 20K psi and
standard deviation 2K psi, and assume that
it is correlated with y(0), i.e. $\rho = 0.90$,
and

k = 0.001.

Thus y(t) is also normally distributed with mean

$$ \mu\{y(t)\} = [\,50K\, e^{-0.001t} + 20K\,(1 - e^{-0.001t})\,]\ \text{psi}, $$

and standard deviation

$$
\begin{aligned}
\sigma\{y(t)\} &= \{4^2 K^2 e^{-0.002t} + (1 - e^{-0.001t})^2\, 2^2 K^2 + 2(0.90)\, e^{-0.001t}\, 4K \cdot 2K\,(1 - e^{-0.001t})\}^{1/2}\ \text{psi} \\
               &= K\,\{5.6\, e^{-0.002t} + 6.4\, e^{-0.001t} + 4\}^{1/2}\ \text{psi}.
\end{aligned}
$$

For t = 1000 cycles, $\mu\{y(t)\} \approx 31{,}000$ psi and $\sigma\{y(t)\} = 2670$ psi. If the
prescribed lower limit were $y_\ell$ = 30K psi, what is the maximum number
of cycles to which the part should be exposed in order that the
probability of its strength exceeding 30,000 psi will be 0.95?

To solve this problem we must find t such that

$$ \mu\{y(t)\} - 1.645\, \sigma\{y(t)\} = 30{,}000, $$

that is,

$$ 50K\, e^{-0.001t} + 20K\,(1 - e^{-0.001t}) - 1.645K\,[\,5.6\, e^{-0.002t} + 6.4\, e^{-0.001t} + 4\,]^{1/2} - 30K = 0. $$

By graphing the left member of the above equation one can estimate
the time t at which the curve crosses the axis and hence obtain a
more exact solution analytically if desired.

Figure 5-7 Strength Versus Elapsed Time

The number of cycles is estimated to be 700. It should be emphasized
that in this example the initial strength and final strength were
assumed to be highly correlated and that was considered in the analysis.
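Both examples above can be checked with a few lines of code. The sketch below recomputes the resistor reliability of Example 5-3 at 1000 hours and, in place of the graphical solution of Example 5-4, locates the crossing by bisection. The function names, the bisection bracket and the tolerance are assumptions; the numerical inputs are those stated in the examples (strengths in thousands of psi).

```python
import math

def phi(x):
    """Cumulative standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def resistor_reliability(t, lower=270.0, upper=390.0):
    """Example 5-3: P(lower < r(t) < upper) for r(t) = a1 + a2 * t."""
    mean = 330.0 - 0.003 * t
    sd = math.sqrt(33.0**2 + (0.001 * t) ** 2)
    return phi((upper - mean) / sd) - phi((lower - mean) / sd)

def strength_lower_5_percent(t, k=0.001):
    """Example 5-4: mean minus 1.645 standard deviations of y(t), in K psi."""
    x = math.exp(-k * t)
    mean = 50.0 * x + 20.0 * (1.0 - x)
    sd = math.sqrt(5.6 * x * x + 6.4 * x + 4.0)
    return mean - 1.645 * sd

def max_cycles(limit=30.0, lo=0.0, hi=2000.0, tol=1e-6):
    """Bisection for the t at which the 5 percent point of strength equals the limit."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if strength_lower_5_percent(mid) > limit:
            lo = mid        # still above the limit: the crossing is later
        else:
            hi = mid
    return 0.5 * (lo + hi)

r_1000 = resistor_reliability(1000.0)
t_cross = max_cycles()
print(round(r_1000, 2))
print(round(t_cross))
```

Bisection is adequate here because the 5 percent point of the strength decreases steadily with t over the bracket, so there is a single crossing; the root found is close to the graphical estimate of 700 cycles.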
6.0 Numerical Index Values

Guidance is given in this section for obtaining numerical values to be used 
for the various single item reliability indexes which were introduced in Secs. 4 and 5. 
Numerical index values (or data as they are sometimes referred to) result from actual 
measurements, either from operational use or from testing. Reliability index measure- 
ment is at best difficult. Most attempts at it suffer from lack of precision in the 
failure criteria, in recording the operating conditions, and in knowledge of the 
history of use of the item. It is desirable to keep in mind the loose conditions
under which the data for most reliability indexes were obtained, so as not to exaggerate
their accuracy.

6.1 Comprehensive Guide 

A recent Navy-sponsored effort to identify reliability data sources elaborates
on sources of reliability data and gives specific information regarding where
to direct inquiries [Ref. 26]. This is a comprehensive guide and identifies many
sources for direct reliability indexes as well as for supporting reliability data.

6.2 Reliability Measure Sources 

Index values for failure rates and other reliability measures as were identified 
in Sec. 4 are treated here. Almost universally these data are for an assumed constant 
failure rate for nonrepairable items and for an assumed mean-time-between-failure of 
the Poisson process for repairable items. 

MIL HDBK 217A. This is a widely used and generally available source. Typically
the failure rate of the nonrepairable item is for an electronic part and is shown 
graphically as a function of several stresses, with additional multipliers to be used 
for different classes of operational use. The latest revision of this is dated 
December 1, 1965, and is revision A [Ref. 27]; however, as of this date another 
revision is in process. 

MTBF Estimating Relationships. Simple MTBF estimating relationships have been
developed for electronic equipment and are quite useful for preliminary predictions.
Here the independent variables may simply be the number of active elements and the
usage class of the equipment [Refs. 2 and 27].

FARADA. The focal point of reliability data is the Failure Rate Data (FARADA) 
program which is sponsored by the Tri-service and NASA in cooperation with qualified 
government contractors. This program is currently conducted by the Naval Fleet 
Missile Systems Analysis and Evaluation Group (FMSAEG) at Coronado, California 
[Ref. 28]. Data inputs from hundreds of sources are collected, compiled and distri- 
buted. The primary distribution is in the form of a series of handbooks. 
Reliability Analysis Central. The Air Force Rome Air Development Center is
currently developing a Reliability Analysis Central which is planned to become fully 
operational by mid 1969 [Ref. 29]. The Central is to serve as the Air Force focal 
point for reliability data. 

Non-electronic Data. The data sources noted to this point in Sec. 6.2 are
primarily electronic in nature. Generally there are more electronic data than data for
other commodities and failure causes. Some compilations of non-electronic reliability
data are Refs. 30 and 31, sponsored respectively by the Navy and Air Force. The
Air Force-sponsored work is still in progress.

6.3 Bound-Crossing Data 

Distribution information to be used with the bound-crossing type of reliability 
measure of Sec. 5 is commented on below. The degradation type of failure mode is 
often not explicitly considered in reliability predictions for electronic items, and 
there are few established data sources for this failure mode. The FARADA program 
and the developing RADC Reliability Analysis Central include degradation and drift 
data under their scope of activity, though not much data are yet included. Equipment 
and system producers who perform degradation studies have, of course, compiled some 
numerical information. Sometimes this is made publicly available to others [Ref. 32]. 
Non-electronic reliability predictions of the bound-crossing variety are principally 
the stress-strength problem. An Air Force-sponsored compilation of appropriate data 
for such predictions has been recently published [Ref. 33]. The data here are
primarily for distributions of fatigue strength of various mechanical materials.

6.4 Remarks

An undesirable feature of currently available data is that too often it is a 
matter of collecting and passing on what has been reported without very much analysis. 
One reason for this is that the inputs coming into these collection points are often 
lacking in supporting information so that analysis is not possible. As the previously 
mentioned FARADA program continues to progress and as the Reliability Analysis Central 
becomes established, it can be expected that there will be more screening and analysis 
on what is ultimately made generally available. An example of the type of data
collecting and analysis which is desirable was recently performed for NASA and was
concerned with historical reliability data from inflight spacecraft [Ref. 34].

Many equipment suppliers are currently collecting and analyzing reliability 
data on equipment which they have produced. This sort of data collection effort is 
extremely desirable and is to be encouraged. If the samples from which such data are 
drawn are of sufficient quantity, the opportunity exists for developing data that can 
be drawn on by the equipment suppliers to give more precise results. 
The person who is interested in reliability data should keep an eye on
the general literature. Occasionally a paper or report will contain preliminary data
of the sort which is not in the established data sources. As an example, consider
the human failure mode. A recent paper remarked that a certain percentage of actual
failures were found to be caused by human error [Ref. 35]. Certain reliability
predictions would be better off to include such failure modes with best available index
values rather than to omit them.
Part III: Multi-Item Problems


Various approaches for developing reliability prediction equations for system
reliability as functions of item reliabilities and other variables are presented.
These are the conventional and classical ones which are suitable for practical
applications. Inputs to these equations are the single item reliability definitions
from Part I.

Section 7 covers logic models, time is explicitly brought into consideration
in Sec. 8, and Sec. 9 covers bound-crossing problems and the influence of environments
which are known probabilistically. This material will, to varying extents, be
old hat to experienced reliability analysts. However, some of it is not stressed
in existing reliability analysis handbooks or books, including the following. In
Sec. 7 are the use of cuts and paths for developing prediction models for complex
configurations and the problem of models for multi-phase missions. In Sec. 8 the
extreme value approach is discussed in a general manner and a general reliability
prediction model is derived (first known publication). In Sec. 9 are discussions
of the influence of environment which is known probabilistically and a specific
application of this to the multi-item stress-strength problem.

Reliability prediction equations have the apparent use of providing a numerical
reliability prediction index for a proposed system configuration. Although the details
contained in this report explicitly cover only this use, it is well to be aware of
other applications. One is using the model for sensitivity studies in order
to study the results of changes in input parameters by either limit or probabilistic
approaches. An approach could involve application of the method of moments such as
cited in Sec. 9.3.1 for a different problem. Another use is to provide part of the
equations needed for the application of literal optimization approaches to reliability
allocation problems. This use prompted the derivation of the general redundancy model
of Sec. 8.4. Yet another use is that certain practical engineering guidelines can be
gleaned from studying the models. An instance of this is the outline at the end of
Sec. 9.2 for multi-item stress-strength problems.

It should also be noted that the discussions of Secs. 7, 8, and 9 are oriented 
mainly toward bringing items together into a system. These modeling concepts are,
of course, the same ones which would be utilized for bringing detailed failure-modes 
together, where an item might have multiple failure modes such as were acknowledged 
in Secs. 4.1 and 4.3. Some treatment of modes will be given in Sec. 7.6 on N-state 
logic models and in Sec. 9.1 on the general question of the influence of environment. 
Part IV will pursue detailed consideration of failure modes. 
7.0 Logic Models

The purpose of this section is to develop prediction models for multi-item 
systems using logic modeling approaches. The system being modeled could be a 
completely general one. Conventionally the system model includes only hardware, 
but the model could be extended to include human operators, environments, signals, 
loads, or other factors which may affect the achievement of system success.
Although the techniques given are applicable to large complex systems as well as to
lower level equipments, the discussion will be about systems containing only a
limited number of items so that it can be followed readily.

Probability of item success or failure for the logic based system models would 
come from the appropriate measure of Part II. However, attention must be given to 
probabilistic independence assumptions. The approaches in Sec. 7 are for the 
situation where operating conditions (or environments) are assumed to be known, or 
if they are unknown, item reliabilities are independent of this uncertainty. There 
still could be dependence among the probability of success for items at fixed 
operating conditions, and if it exists then it must be reflected in the logic models. 

The reliability logic diagram such as shown in Fig. 7-1 and throughout Sec. 7 
is a useful starting place for the discussion. In the reliability block diagram 
each block is a two-state item (non-failed or failed). The manner in which the
blocks are connected describes the non-failed system in terms of the items comprising 
the system. 

Organization of this section is to introduce first the basic set operations 
in Sec. 7.1 and then to apply them to various system configurations throughout the 
remainder of this section. 

7.1 Basic Set Operations and Calculus of Probability 

In order to predict the reliability of a system given the reliability logic 
diagram and the probabilities of success (or failure) of the individual items, 
it is necessary to understand basic set operations and the associated calculus of 
probabilities. See Appendices A.4 and A.5 for a brief introduction to these
techniques and a summary of basic results. For example, suppose that the system 
under consideration is composed of three items, A, B, and C in a series logic as 
illustrated in Fig. 7-1 below. 





Figure 7-1 Series Logic 
The successful operation of the system is equivalent to each of the items operating
or not failing. Let A denote the event that item A is operating, and let B and C similarly
denote successful operation of items B and C. In this terminology A represents the
event of successful item operation. The event that all three items operate is
the logical intersection of the events A, B, and C and is denoted by

    $A \cap B \cap C$


or simply


    $ABC$. *


Now let the probability that item A operates under stated conditions be P(A), and
similarly for B and C. The probability that all three items perform successfully
is given by


    $P(A)\, P(B)\, P(C)$,

if the events A, B, and C are independent, that is, if the occurrence of A
does not in any way alter the probability that B occurs, and likewise with respect to the
other events. Further discussion concerning the notion of independence is given in
Appendix A.4. If the events are not independent the probability may be written as

    $P(A)\, P(B|A)\, P(C|AB)$

where $P(B|A)$ is the probability that B occurs given that A has occurred. In this
section the independence assumption will be used very liberally because of the 
resulting complexities in not using this assumption, and also because the items 
can sometimes be defined such that the assumption of independence is reasonable. 
However, the user of the techniques should not automatically assume independence 
without some self-questioning. 

* The simpler notation, i.e., $ABC$, will be used. The alternate notation, i.e., $A \cap B \cap C$,
is cited in this introductory discussion so that the reader will be aware of it, as
this alternate notation is also widely used.

If the system consists of three items as illustrated below



Figure 7-2 Mixed Logic


then the components B and C are said to be in parallel. The successful operation of
the system is equivalent to the operation of (A and B) or (A and C); expressed in
another way, it may be stated as the operation of A and (B or C). Thus the logical
rules may be stated as


    $(A \cap B) \cup (A \cap C)$  or  $AB + AC$

or

    $A \cap (B \cup C)$  or  $A(B + C)$.

The symbols for the intersection or product are $A \cap B$ or $AB$, as were used above for
series logic, and for the sum or union are $A \cup B$ or $A + B$. The first expression may
be obtained from the latter by performing the logical multiplication of A with the
union of B and C. The probability that the system performs successfully is given
by


    $P[A(B + C)]$

or

    $P(A)\, P(B + C)$

if A, B, and C are independent. The probability of B + C is the probability that one
or the other or both of the events B and C occur. One rule for finding the
probability of B + C is

    $P(B + C) = P(B) + P(C) - P(BC)$.

This can easily be seen by using the following Venn diagram. Let B and C be denoted 
by the overlapping events as shown below. 



Figure 7-3 Overlapping or Intersecting Events 
The shaded portion represents the intersection of B and C and if one obtains the
P(B) and adds the P(C) one sees that the P(BC) is counted twice, thus it must be 
subtracted from the added probabilities to obtain the P(B + C) , which is the 
probability associated with the occurrence of all events enclosed by the boundaries 
of the events B and C. 

Another way in which one can obtain the probability of the successful operation
of B + C is to use the fact that failure occurs only if both B and C fail, i.e.,
$\bar{B}\bar{C}$, where the bar denotes the failure (complement) of the event. Thus


    $P(B + C) = 1 - P(\bar{B}\bar{C})$

    $= 1 - P(\bar{B})\, P(\bar{C})$

assuming independence. Note that since $B$ and $\bar{B}$ are complementary events, that is,
one or the other of these events must occur, then

    $P(B) + P(\bar{B}) = 1$

or

    $P(\bar{B}) = 1 - P(B)$.

The following numerical examples are given for illustrating the notions introduced 
so far. 


Example 7-1 

Let the system be as follows: 



Figure 7-4 Logic Diagram for Example 7-1

Let P(A) = 0.99, P(B) = 0.95, P(C) = 0.90, and P(D) = 0.95 and assume
that the events are independent under the given operating conditions. Then
the probability of successful operation of the system is given by


    $P(S) = P[AB(C + D)]$

    $= P(AB)\,[P(C) + P(D) - P(CD)]$

    $= 0.9405\,[0.90 + 0.95 - 0.855]$

    $= 0.9405\,[0.995]$

    $\approx 0.936$ (rounded to 3 decimal places).


The same result is obtained by using the complementary event and thus


    $P(S) = P(AB)\,[1 - P(\bar{C})\, P(\bar{D})]$

    $= 0.9405\,[1 - (0.10)(0.05)] = 0.936$ as before.


The latter way is usually simpler and will be used throughout this section with few
exceptions. Again the reader is cautioned that the use of the above formulas implies
independence of the events A, B, C, and D.
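
The two routes to the same answer can be compared directly. The following is a minimal sketch in Python, assuming independent events and the item probabilities of Example 7-1. The helper names `series` and `parallel` are illustrative and do not come from the report.

```python
from math import prod


def series(*probs):
    """Probability that all independent items operate."""
    return prod(probs)


def parallel(*probs):
    """Probability that at least one independent item operates,
    computed from the complementary event (all items fail)."""
    return 1.0 - prod(1.0 - p for p in probs)


p_a, p_b, p_c, p_d = 0.99, 0.95, 0.90, 0.95

p_ab = series(p_a, p_b)

# Route 1: union rule P(C + D) = P(C) + P(D) - P(CD)
via_union = p_ab * (p_c + p_d - p_c * p_d)

# Route 2: complementary event, 1 - P(C fails) P(D fails)
via_complement = p_ab * parallel(p_c, p_d)

print(round(via_union, 3), round(via_complement, 3))
```

The complementary form extends to any number of parallel items without the inclusion and exclusion terms that the union rule would require.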

The set of items A, B, and C may be considered a success path or path (success 
understood) and likewise A, B, and D constitutes a second success path. The system 
will fail if either A, B, or C and D fail, and these three sets of items constitute 
cuts of the equipment. In Sec. 7.4 the notions of paths and cuts will be used to 
obtain bounds on the probability of success (or failure). 

Certain diagrams may be used to aid in the probability calculations and
interpretation. Consider the use of a tree diagram for Ex. 7-1 above.



Figure 7-5 Tree Diagram for Example 7-1

The probability of success is given by


    $P(A)\, P(B)\, [P(CD) + P(D\bar{C}) + P(C\bar{D})]$.

Such tree diagrams can be easily sketched with experience and the probability 
expressions written down by hand. However, such techniques would be limited to 
relatively simple systems. It will be assumed here that for very complex systems 
one will use a computer program solution. However, a basic understanding of the
techniques is needed in order not to apply a particular technique incorrectly.
Ref. 36 presents a more detailed discussion of the tree diagram approach.

Another approach which can be applied to relatively simple systems is that of
using Boolean algebra, an algebra of sets. Just as one can perform operations of
addition and multiplication with numbers, one can perform analogous operations
with sets or events as above. Ref. 37 presents a complete discussion of
this approach. A brief discussion of Boolean algebra is given in Appendix A.5.

7.2 Applications to Various System Configurations 

In this section the concepts of Sec. 7.1 will be applied to logic configurations 
where the model can be written by simple visual inspection. 

7.2.1 Series Configuration

If the items of the system are in a series configuration, that is, if each
item must operate in order that the system will successfully perform its function,
then the probability of success is given by

    $P(S) = P(A_1 A_2 \cdots A_n)$

where there are n components in series configuration as indicated in Fig. 7-6:



Figure 7-6 Series Configuration

If the events $A_1, A_2, \ldots, A_n$ are independent then

    $P(S) = P(A_1)\, P(A_2) \cdots P(A_n)$

    $= \prod_{i=1}^{n} P(A_i)$.    (7-1)

If all the items have very high reliability a useful approximation is that

    $P(S) \approx 1 - \sum_{i=1}^{n} P(\bar{A}_i)$.    (7-2)

If all the items are identical then Eq. 7-1 becomes

    $P(S) = [P(A)]^n = [1 - P(\bar{A})]^n$.    (7-3)

In fact it can be shown that the approximation Eq. 7-2 is a lower bound to P(S), i.e.,

    $P(S) \geq 1 - n\, P(\bar{A})$,

where all the items are identical.
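
The quality of the high-reliability approximation is easy to examine numerically. The following is a minimal sketch in Python, assuming independent items. The failure probabilities are hypothetical values chosen only for illustration, and the function names are not from the report.

```python
from math import prod


def series_exact(fail_probs):
    """Eq. 7-1 written in terms of item failure probabilities."""
    return prod(1.0 - q for q in fail_probs)


def series_lower_bound(fail_probs):
    """Eq. 7-2: one minus the sum of the item failure probabilities."""
    return 1.0 - sum(fail_probs)


# Hypothetical failure probabilities for four highly reliable items
q = [0.001, 0.002, 0.0005, 0.003]

exact = series_exact(q)
bound = series_lower_bound(q)
print(f"exact = {exact:.6f}, approximation = {bound:.6f}")
```

For failure probabilities of this size the two values differ only in the fifth decimal place, and the approximation lies below the exact value. The gap widens as the item failure probabilities grow.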

7.2.2 Parallel Configuration

If several items are in parallel, that is, the system operates if one or more
of the items operate, then the probability of successful operation is given by

    $P(S) = 1 - P(\text{all components fail})$

    $= 1 - \prod_{i=1}^{n} P(\bar{A}_i)$,    (7-4)

where a parallel configuration is illustrated below. 



Figure 7-7 Parallel Configuration 
Another configuration might be one which requires at least k out of n successful
items in a parallel configuration in order for the system to operate successfully.
In this case the probability of success is given by the following if all the items
are identical:

    $P(S) = \sum_{i=k}^{n} \binom{n}{i} P^i(A)\, [1 - P(A)]^{n-i}$    (7-5)

or

    $P(S) = 1 - \sum_{i=0}^{k-1} \binom{n}{i} P^i(A)\, [1 - P(A)]^{n-i}$,    (7-6)

where $\binom{n}{i}$ is the number of combinations of i items taken from n items, that is,

    $\binom{n}{i} = \frac{n!}{i!\,(n-i)!}$.    (7-7)


Eq. 7-6 would be easier to apply if k were small compared to n. A similar expression
may be written in the case of non-identical items; however, the case of identical
items is more typical. Such formulas are useful for a system such as a nuclear
reactor in which one needs only a certain number of control rods to shut down the
reactor, or in the case of an airplane which needs only two engines of four in order
to take off, or in majority voting logic schemes.

It is important to note that independence is assumed in the above approaches.
In particular, if all the items were subjected to a critical environment during the
mission, then the events of failure may not be independent as assumed above. Similarly,
if failure of one item increases the stress and thus the probability of failure of
another item, the independence assumption may not be correct.
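
The k-out-of-n formulas for identical, independent items can be sketched as follows in Python. The function names and the item reliability of 0.9 in the two-of-four engine illustration are assumptions made for this sketch only.

```python
from math import comb


def k_out_of_n(k, n, p):
    """Eq. 7-5: sum the binomial terms for i = k, ..., n successes."""
    return sum(comb(n, i) * p ** i * (1.0 - p) ** (n - i)
               for i in range(k, n + 1))


def k_out_of_n_complement(k, n, p):
    """Eq. 7-6: one minus the binomial terms for i = 0, ..., k - 1."""
    return 1.0 - sum(comb(n, i) * p ** i * (1.0 - p) ** (n - i)
                     for i in range(k))


# Airplane needing two engines of four, with a hypothetical engine reliability
p_engine = 0.9
r_direct = k_out_of_n(2, 4, p_engine)
r_complement = k_out_of_n_complement(2, 4, p_engine)
print(f"{r_direct:.4f} {r_complement:.4f}")
```

The two forms agree. The complement form sums k terms and the direct form sums n - k + 1 terms, so the choice between them is a matter of which sum is shorter.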


7.2.3 Mixed Configurations 

Parallel-series. A parallel-series configuration is as shown below in Fig. 7-8.



Figure 7-8 Parallel-Series Configuration


The probability of success is obtained by using the fact that either $A_1, \ldots, A_n$ must
all operate or $B_1, \ldots, B_n$ must all operate or both. The simplest approach is to
first apply the series formula, replacing $A_1, \ldots, A_n$ by A and $B_1, \ldots, B_n$ by B, thus
reducing to the simplified version shown in Fig. 7-9.



Figure 7-9 Reduction of Configuration in Fig. 7-8


Thus


    $P(A) = \prod_{i=1}^{n} P(A_i)$

and


    $P(B) = \prod_{i=1}^{n} P(B_i)$.

Then one uses Eq. 7-4 for parallel configurations to obtain


    $P(S) = 1 - P(\bar{A})\, P(\bar{B})$


or in expanded form


    $P(S) = 1 - [1 - P(A)]\,[1 - P(B)]$


    $= 1 - [1 - \prod_{i=1}^{n} P(A_i)]\,[1 - \prod_{i=1}^{n} P(B_i)]$.    (7-8)

In this approach one has a tool for simplifying complex circuits or systems step
by step until they are reduced to a relatively simple logic configuration. The same
approach will be applied to some other examples below.

Series-parallel. Let there be a subsystem of m items in parallel and n of these
subsystems in series as shown in Fig. 7-10.

Figure 7-10 Series-Parallel Configuration


The probability of success for the i-th subsystem $A_i$ containing m identical
items in parallel is given by

    $P(A_i) = 1 - \prod_{j=1}^{m} P(\bar{A}_{ij})$

and the new simplified configuration becomes that shown in Fig. 7-11.



Figure 7-11 Reduction of Configuration in Fig. 7-10

As the $A_i$ are in a series configuration

    $P(S) = \prod_{i=1}^{n} P(A_i) = \prod_{i=1}^{n} \left(1 - \prod_{j=1}^{m} P(\bar{A}_{ij})\right)$.    (7-9)

It is not necessary to treat the number of parallel items as being equal to m for all i, and the above
formula could be generalized by replacing m by $m_i$, i = 1, ..., n. Many configurations
can be treated by one of the above configurations. Two examples are
given below to demonstrate some of the formulas, although the examples are worked
from basic considerations.
Example 7-2

Let the configuration be as shown in Fig. 7-12,



Figure 7-12 Configuration

where

    $P(A) = 0.99$,    $P(B_1) = P(B_2) = 0.90$,

    $P(C) = 0.95$,    $P(D_1) = P(D_2) = 0.98$,

and the events are assumed to be independent.

Now

    $P(S) = P(A)\, P(B)\, (1 - P(\bar{C})\, P(\bar{D}))$

where

    $P(B) = 1 - P(\bar{B}_1)\, P(\bar{B}_2)$

    $P(\bar{D}) = 1 - P(D) = 1 - P(D_1)\, P(D_2)$.

Note that one cannot write $P(\bar{D}) = P(\bar{D}_1)\, P(\bar{D}_2)$; that is, the event that D fails
is not equivalent to $D_1$ and $D_2$ both failing to operate, but to either
one or the other or both failing to operate. Substituting the numerical
values yields


    $P(S) = (0.99)\,[1 - (0.10)(0.10)]\,[1 - (0.05)(1 - 0.98^2)]$

    $= 0.978$.
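
The step-by-step reduction of a mixed configuration lends itself to a small recursive evaluation. The following is a minimal sketch in Python, assuming independent items. The nested-tuple description of the block diagram and the function name `reliability` are conventions invented for this sketch, not notation from the report.

```python
from math import prod


def reliability(block):
    """Evaluate a nested series/parallel description of a logic diagram.

    A block is either an item success probability (a number) or a tuple
    whose first element is "series" or "parallel" and whose remaining
    elements are sub-blocks. Items are assumed independent.
    """
    if isinstance(block, (int, float)):
        return float(block)
    kind, *children = block
    values = [reliability(child) for child in children]
    if kind == "series":
        return prod(values)
    if kind == "parallel":
        return 1.0 - prod(1.0 - v for v in values)
    raise ValueError(f"unknown block type: {kind!r}")


# Example 7-2: A in series with (B1 parallel B2), in series with
# (C parallel (D1 in series with D2))
example_7_2 = ("series",
               0.99,
               ("parallel", 0.90, 0.90),
               ("parallel", 0.95, ("series", 0.98, 0.98)))

p_s = reliability(example_7_2)
print(f"P(S) = {p_s:.4f}")
```

Each tuple corresponds to one reduction step of the kind shown in Figs. 7-9 and 7-11, so the nesting depth mirrors the number of reductions needed. This approach does not handle configurations in which the same item appears in more than one branch.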
Example 7-3

Let the configuration be as shown in Fig. 7-13



Figure 7-13 Configuration for Ex. 7-3
and the associated probabilities be

    $P(A) = 0.95$, $P(C) = 0.98$, $P(B_1) = P(B_2) = 0.95$,
    $P(D) = 0.90$, $P(E_1) = P(E_2) = 0.90$.

Then if the above is replaced by



Figure 7-14 Reduction of Configuration for Ex. 7-3

the subsystem probabilities are

    $P(S_1) = P(A)\,(1 - P(\bar{B}_1)\, P(\bar{B}_2))$

    $= (0.95)(1 - (0.05)^2) = 0.9476$

    $P(S_2) = P(C)\,(1 - P(\bar{D})\, P(\bar{E}))$

    $= (0.98)(1 - (0.10)(1 - (0.90)^2)) = 0.961$.


7.3 Conditional Probabilistic Approach 

We have seen from the above discussion that when the reliability logic diagram 
consists of series, parallel, and/or mixed configurations, the mathematical logic 
model can be written directly and easily. However, complex systems cannot always be 
reduced to a convenient configuration as stated above. In such cases it may be
convenient to use the fact that the probability of success of the system is the sum,
over the possible states of a subsystem, of the probability of system success given a
particular state of the subsystem (which may be for either one item or a collection
of items or an environmental state) multiplied by the probability that the subsystem
is in the particular state. This result applies when the states $B_i$, i = 1, ..., n
of the subsystem are exhaustive and mutually exclusive, that is

    $P(B_1 + B_2 + B_3 + \cdots + B_n) = 1$

(the B's include all possible events or occurrences)


and

    $B_i B_j = \emptyset$,  $i \neq j$

(the B's are mutually exclusive or have no common occurrences).

Hence the system success probability P(S) is given by

    $P(S) = \sum_{i=1}^{n} P(S|B_i)\, P(B_i)$.    (7-10)

The proper selection of the $B_i$, i = 1, ..., n can aid in the solution of the problem.
Essentially one wishes to select the states $B_i$ such that the logic diagram reduces to
a form for which the probabilistic model can be written easily. See Ref. 38.


Example 7-4 

Consider a system of five (5) items functionally arranged in the
configuration shown below. The success paths flow from left to right,
and there are no right to left portions in a success path. Success
paths are $A_1A_5$, $A_2A_3A_5$, and $A_2A_4$, but $A_1A_3A_4$ is not a success path.

Figure 7-15 Functional Diagram for Ex. 7-4

The solution using conditional probabilities is given first and then a Boolean
algebra approach is shown in order to illustrate the difference.

Using Conditional Probabilities

Select events $B_1 = A_2A_5$, $B_2 = A_2\bar{A}_5$, $B_3 = \bar{A}_2A_5$, $B_4 = \bar{A}_2\bar{A}_5$, which are disjoint
(mutually exclusive) and exhaustive. The selection of items is quite arbitrary.
One could just as easily write the probabilistic model using other items. Now

    $P(B_i B_j) = 0$,  $i \neq j$;  i, j = 1, 2, 3, 4

and

    $P(B_1 + B_2 + B_3 + B_4) = 1$

or

    $P(A_2A_5 + A_2\bar{A}_5 + \bar{A}_2A_5 + \bar{A}_2\bar{A}_5) = 1$.

The reliability logic diagram can be simplified as indicated below for the various
states of items $A_2$ and $A_5$. The conditional probabilities of success given the
various states $B_i$ are given in the last column of Table 7-1 where $p_i$ and $q_i$ denote
the probabilities of successful operation and failure under stated conditions of
the respective items $A_i$, i = 1, ..., 5. The system success probability may be
expressed as

    $P(S) = p_2p_5(1 - q_1q_3q_4) + p_2q_5(p_4) + q_2p_5(p_1) + q_2q_5(0)$

    $= p_2p_5 - p_2p_5q_1q_3q_4 + p_2q_5p_4 + q_2p_5p_1$

and after some algebraic reduction using p = 1 - q


    $P(S) = p_1p_5 + p_2p_3p_5 + p_2p_4$

    $- p_1p_2p_3p_5 - p_1p_2p_4p_5 - p_2p_3p_4p_5$

    $+ p_1p_2p_3p_4p_5$.


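Table 7-1

Conditional Logic Diagrams and Associated Probabilities
(the conditional logic diagrams are not reproduced here)

    State                           Conditional Probability
    $B_1 = A_2A_5$                  $1 - q_1q_3q_4$
    $B_2 = A_2\bar{A}_5$            $p_4$
    $B_3 = \bar{A}_2A_5$            $p_1$
    $B_4 = \bar{A}_2\bar{A}_5$      $0$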
Using Boolean Algebra

The Boolean algebra success model is

    $P(S) = P\{A_1A_5 + A_2[A_3A_5 + A_4]\}$

    $= P\{A_1A_5 + A_2A_3A_5 + A_2A_4\}$,

and expanding using Theorem 4 of Appendix 4 and substituting the item success
probabilities p will yield the same results as were obtained above using conditional
probabilities.
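
The agreement between the conditional-probability model and the expanded model of Example 7-4 can be checked by brute-force enumeration of all item states. The following is a minimal sketch in Python, assuming independent items. The item reliabilities are hypothetical, since the example gives none, and the function names are illustrative.

```python
from itertools import product

# Success paths of Example 7-4, written as sets of item numbers
TIE_SETS = [{1, 5}, {2, 3, 5}, {2, 4}]


def exact_by_enumeration(p):
    """Add the probabilities of all 2**5 item states in which some path works."""
    total = 0.0
    for state in product([True, False], repeat=5):
        working = {i + 1 for i, ok in enumerate(state) if ok}
        if any(path <= working for path in TIE_SETS):
            weight = 1.0
            for i, ok in enumerate(state):
                weight *= p[i + 1] if ok else 1.0 - p[i + 1]
            total += weight
    return total


def conditional_model(p):
    """Eq. 7-10 applied to the four states of items 2 and 5."""
    q = {i: 1.0 - v for i, v in p.items()}
    return (p[2] * p[5] * (1.0 - q[1] * q[3] * q[4])
            + p[2] * q[5] * p[4]
            + q[2] * p[5] * p[1])


def expanded_model(p):
    """Expanded polynomial form of the same model."""
    return (p[1] * p[5] + p[2] * p[3] * p[5] + p[2] * p[4]
            - p[1] * p[2] * p[3] * p[5]
            - p[1] * p[2] * p[4] * p[5]
            - p[2] * p[3] * p[4] * p[5]
            + p[1] * p[2] * p[3] * p[4] * p[5])


# Hypothetical item reliabilities, for illustration only
p = {1: 0.90, 2: 0.85, 3: 0.80, 4: 0.95, 5: 0.92}

r_enum = exact_by_enumeration(p)
r_cond = conditional_model(p)
r_poly = expanded_model(p)
print(f"{r_enum:.6f} {r_cond:.6f} {r_poly:.6f}")
```

Enumeration grows as 2 to the power of the number of items, so it serves here only as a check on the hand-derived models for a small system.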

7.4 Models Using Cuts and Paths 

The concept of cut sets and success paths (or tie sets) offers another approach
to the development of reliability prediction models for complex systems.
In particular this approach is advantageous where the same item may appear more than
once in the reliability block diagram. Such a situation could arise where a system
must perform a number of functions but some items are used in more than one function.
Here a different reliability logic diagram could be prepared for each function where
the same item will appear more than once. Another situation could arise where
different functions are to be performed by the system during subsequent mission
phases, thus leading to a different reliability logic diagram for each phase where
the same item will appear more than once. The cuts and paths approach can be used
to obtain an exact model, but this will usually be quite involved; the advantage
is that an approximate model can be readily developed. The more important results
are given in this section as derived in Ref. 39. The system reliability is
defined as the probability of successful function of all of the items in at least
one tie set or the probability that all cut sets are good. A tie set or success
path is a directed path from input to output as indicated in the simple system in
Fig. 7-16A. The tie sets or success paths are 2, 5; 1, 3, 5; and 1, 4, 5, respectively.
A cut set is a set of items which literally cuts all success paths or tie sets.
One is normally interested in the minimal cut set, i.e., the smallest or minimal set
of items such that the elimination of any one item would no longer make it a cut.
This is because a nonminimal cut set corresponds to more item failures than are
required to cause system failure. In the above example the minimal cut sets are
1, 2; 2, 3, 4; and 5. Note that 1, 5 is not a minimal cut set since 5 is already a cut
set and is a subset of 1, 5. A cut set cuts the line of communication between input
and output. A cut set is good if at least one of its elements is operative. The
system failure probability or system unreliability is the probability that all tie
sets are bad (a tie set is bad if at least one item fails) or the probability that
at least one cut set is bad (that is, all its items are bad). Hereafter, cut set
will usually mean minimal cut set.

Let $T_i$, i = 1, ..., I denote the tie sets, I in number; and $C_j$, j = 1, ..., J
denote the cut sets, J in number. The above statement for system reliability R can
be expressed as follows:

    $R = P\{T_1 + T_2 + \cdots + T_I\} = P\{\text{at least one tie set is good}\}$    (7-11)

or

    $R = P\{C_1 C_2 \cdots C_J\} = P\{\text{all cut sets are good}\}$.    (7-12)

The sets $C_j$, j = 1, ..., J contain common items and thus $R \neq \prod_{j=1}^{J} P\{C_j\}$.

Equivalently the unreliability is expressed as

    $1 - R = P\{\bar{T}_1 \bar{T}_2 \cdots \bar{T}_I\} = P\{\text{all tie sets are bad}\}$    (7-13)

or

    $1 - R = P\{\bar{C}_1 + \bar{C}_2 + \cdots + \bar{C}_J\} = P\{\text{at least one cut set is bad}\}$.    (7-14)

Similarly the tie sets $T_i$, i = 1, ..., I contain common items and thus

    $1 - R \neq \prod_{i=1}^{I} P\{\bar{T}_i\}$.



Figure 7-16A Simple Reliability Logic Diagram 
Figure 7-16B Reliability Graph Corresponding to Functional Logic Diagram

Formulas 7-12 and 7-13 are not convenient for computation as the cut and tie sets
contain common items. The probability that all cut sets are good (or that all tie
sets are bad) cannot be obtained by multiplying the individual probabilities that
the cut sets are good (or that the tie sets are bad). The "good" (or "bad") cases
must be enumerated in order to perform the required computation and the corresponding
probabilities added. However, this approach does not lend itself to computerization.
The formulas 7-11 and 7-14 can be expanded into a sum of probabilities associated
with one set, two sets, etc. as shown in standard probability texts. These expanded
forms can then be "chopped off" at desired points to obtain bounds to the system
reliability. The above are exact formulas for the system reliability and unreliability.
Bounds can be obtained by using the basic probabilistic inequalities given below.
A computer program, which is described in Vol. II - Computation, has been developed
for Eqs. 7-20 and 7-21 and for further generalizations of these bounds.


R = P{T. + T 2 + ••• Tj.} £ E PtTj,}, (7-15) 

R = P{T, + T 0 + ••• T } >_ T. P{T } - l P{T T }, etc. (7-16) 

1 1 1 1 ± 1 <i 2 1 2 


Thus an upper bound and a lower bound to the reliability are respectively 

$R_{U1} = \sum_{i} P\{T_i\}$    (7-17)

$R_{L1} = \sum_{i} P\{T_i\} - \sum_{i_1 < i_2} P\{T_{i_1} T_{i_2}\}$.    (7-18)

In the same manner another upper bound is obtained

$R_{U2} = \sum_{i} P\{T_i\} - \sum_{i_1 < i_2} P\{T_{i_1} T_{i_2}\} + \sum_{i_1 < i_2 < i_3} P\{T_{i_1} T_{i_2} T_{i_3}\}$.    (7-19)


The summations are over all possible combinations of the subscripts taken 2 at a time, 
3 at-a-time, etc. 

Similarly the inequalities of Eqs. 7-15 and 7-16 can be applied to the cut-set
form of the equation for unreliability, Eq. 7-14, to obtain

$1 - R \le \sum_{j} P\{\bar{C}_j\}$

or

$R \ge 1 - \sum_{j} P\{\bar{C}_j\} = R_{L2}$    (7-20)

and by using two terms

$R \le 1 - \sum_{j} P\{\bar{C}_j\} + \sum_{j_1 < j_2} P\{\bar{C}_{j_1} \bar{C}_{j_2}\} = R_{U3}$.    (7-21)


Example 7-5 

Consider the reliability graph given in Fig. 7-11. Assume
independence between items and let the probabilities of success
for each of the items be $p_1 = 0.93$, $p_2 = 0.86$, $p_3 = 0.92$, $p_4 = 0.95$,
$p_5 = 0.98$. The probabilities for the ties and cuts are as follows:

$P\{T_1\} = P\{2 \cdot 5\} = 0.8428$
$P\{T_2\} = P\{1 \cdot 3 \cdot 5\} = 0.8385$
$P\{T_3\} = P\{1 \cdot 4 \cdot 5\} = 0.8658$

and

$P\{C_1\} = 1 - P\{\bar{1} \cdot \bar{2}\} = 1 - 0.0098 = 0.9902$
$P\{C_2\} = 1 - P\{\bar{2} \cdot \bar{3} \cdot \bar{4}\} = 1 - 0.00056 = 0.99944$
$P\{C_3\} = 1 - P\{\bar{5}\} = 1 - 0.02 = 0.98$.

Upper and lower bounds for the reliability are given by using Eqs. 7-17,
7-18, 7-19, 7-20, and 7-21, respectively,

$R_{U1} = \sum_{i} P\{T_i\} > 1$ (do not use $R_{U1}$, as $R \le 1$)

$R_{L1} = 0.843 + 0.838 + 0.866 - P\{1 \cdot 2 \cdot 3 \cdot 5\} - P\{1 \cdot 2 \cdot 4 \cdot 5\} - P\{1 \cdot 3 \cdot 4 \cdot 5\} = 0.2848$

$R_{U2} = 0.2848 + 0.6850 = 0.9698 = R$ (this result is equal to the system reliability)

$R_{L2} = 1 - \sum_{j} P\{\bar{C}_j\} = 1 - 0.03036 = 0.96964$

$R_{U3} = 1 - \sum_{j} P\{\bar{C}_j\} + \sum_{j_1 < j_2} P\{\bar{C}_{j_1} \bar{C}_{j_2}\} = 1 - 0.03036 + 0.00024 = 0.96988$.

As stated by Messinger [Ref. 39] the bounds based on the cut sets are best
in the high reliability region and those based on the tie sets are best in the low
reliability region. Hence the bounds $R_{L2}$ and $R_{U3}$ are the preferred bounds in the
above example. $R_{U2}$ in this case saves no computation: it is the exact probability
of system success, as there are only three tie sets and the bound uses all
combinations of tie sets up to and including three sets.

In more general problems in which there are $J$ cut sets the numbers of terms to
be obtained in the lower and upper bound computations are $J$ and $J(J-1)/2$
respectively. This is compared to $2^J - 1$ terms obtained by expanding either Eq. 7-11 or
7-14 using tie sets or cut sets respectively.
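
The truncated expansions can be checked numerically. The following is a minimal Python sketch, not the program described in Vol. II; it assumes independent items and the tie and cut sets implied by the probabilities quoted in Example 7-5. The helper names `joint` and `partial_sums` are illustrative only.

```python
from itertools import combinations
from math import prod

# Item success probabilities from Example 7-5.
p = {1: 0.93, 2: 0.86, 3: 0.92, 4: 0.95, 5: 0.98}
q = {i: 1.0 - pi for i, pi in p.items()}   # item failure probabilities

# Tie and cut sets implied by the probabilities quoted in the example.
tie_sets = [{2, 5}, {1, 3, 5}, {1, 4, 5}]
cut_sets = [{1, 2}, {2, 3, 4}, {5}]

def joint(sets, item_prob):
    """Probability that every item in the union of `sets` is in the state
    described by `item_prob` (valid only for independent items)."""
    return prod(item_prob[i] for i in set().union(*sets))

def partial_sums(sets, item_prob, max_order):
    """Inclusion-exclusion expansion truncated after 1, 2, ..., max_order terms."""
    total, sums = 0.0, []
    for k in range(1, max_order + 1):
        term = sum(joint(c, item_prob) for c in combinations(sets, k))
        total += (-1) ** (k + 1) * term
        sums.append(total)
    return sums

r_u1, r_l1, r_u2 = partial_sums(tie_sets, p, 3)   # Eqs. 7-17, 7-18, 7-19
u1, u2 = partial_sums(cut_sets, q, 2)             # truncated sums for 1 - R
r_l2, r_u3 = 1.0 - u1, 1.0 - u2                   # Eqs. 7-20, 7-21

print(f"R_U1={r_u1:.4f} R_L1={r_l1:.4f} R_U2={r_u2:.5f}")
print(f"R_L2={r_l2:.5f} R_U3={r_u3:.5f}")
```

With only three tie sets the third partial sum is the complete expansion, which is why `r_u2` coincides with the exact reliability here.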

7.5 Multi-Phase Mission 

The approaches given thus far in Sec. 7 apply to a single mission phase; a more
general treatment is needed for certain multi-phase missions. Such a treatment is
useful for situations such as a lunar orbit mission or a lunar landing and return
mission, in which the environment and the configuration change with the successive
phases of the mission. In such a mission an item used in several phases may have
different probabilities associated with each phase. Consider the following
configuration, for example.

Phase 1 Phase 2 Phase 3


Figure 7-17 Multi-phase Configuration for Ex. 7-6 

If the probabilities of success in phases 1, 2, and 3 are denoted by $P(S_1)$, $P(S_2)$,
and $P(S_3)$, and the environments of the respective phases by $E_1$, $E_2$, and $E_3$, then
the probability of mission success $P(S)$ is given by the following relationship

$P(S) = P(S_1 | E_1) \, P(S_2 | E_2; S_1) \, P(S_3 | E_3; S_1, S_2)$,

which can be written in general form for $p$ phases

$P(S) = P(S_1 | E_1) \cdots P(S_p | E_p; S_1, S_2, \ldots, S_{p-1})$.    (7-22)

These formulations guide the computational procedure so as to include the effects of
environmental stresses in the jth phase and the previous stress history in phases
$j-1, j-2, \ldots, 1$, having obtained the probabilities for the items in each of the
phases. Usually one has to enumerate all of the conditions for each phase and sum
the products of the conditional probabilities over possible combinations of conditions.
One approach is patterned after that given in Sec. 7.3 and is illustrated in the
following example.

Example 7-6


Consider the example given in Fig. 7-17. Let the probabilities
of success for the various items in phases 1, 2, and 3 of the mission
be as given in the following table.

Item   Phase 1               Phase 2                  Phase 3
1      P(1|E_1) = 0.99       -                        -
2      P(2|E_1) = 0.95       -                        P(2|E_3,S_1,S_2) = 0.92
3      P(3|E_1) = 0.94       P(3|E_2,S_1) = 0.96      -
4      P(4|E_1) = 0.98       P(4|E_2,S_1) = 0.99      -
5      -                     P(5|E_2,S_1) = 0.97      -
6      -                     P(6|E_2,S_1) = 0.94      -
7      -                     -                        P(7|E_3,S_1,S_2) = 0.97
8      -                     -                        P(8|E_3,S_1,S_2) = 0.96


For this example consider the various ways in which success in Phase 1 
can occur. They are: 


1) all items (1, 2, 3, and 4) operate for Phase 1,
2) items 1, 3, and 4 operate and 2 fails,
3) items 1, 2, and 4 operate and 3 fails,
4) items 1, 2, and 3 operate and 4 fails, and
5) items 1 and 2 operate and 3 and 4 fail.


All other combinations of successes and failures will result in failure
of Phase 1. For each of the above conditions it is necessary to obtain
the conditional probability of success in Phase 2, and similarly in
Phase 3. There is a slight simplification in this example in that no
common items are contained in Phases 2 and 3; hence it is not necessary
to consider all of the possibilities in Phase 2 prior to obtaining the
conditional probabilities in Phase 3. Consider now the conditional
probabilities for Phase 2 for each of the conditions given above and in
environment $E_2$.

Case 1) $P(S_2 | S_1, E_2) = P(3 \text{ or } 5 \cdot 6 \text{ or } 5 \cdot 4 \,|\, E_2)$
        $= 1 - P(\bar{3}) [1 - P(5) \{1 - P(\bar{6}) P(\bar{4})\}]$

where $P(\bar{3})$ indicates the probability of failure of component 3, $P(5)$ success of
item 5 in Phase 2, etc.

Case 2) Same as for Case 1, as item 2 does not appear in Phase 2; however, Phase 3
is altered.

Case 3) $P(S_2 | S_1, E_2) = P(5 \cdot 6 \text{ or } 5 \cdot 4 \,|\, E_2)$
        $= P(5) P(6) + P(5) P(4) - P(4) P(5) P(6)$

Case 4) $P(S_2 | S_1, E_2) = P(3 \text{ or } 5 \cdot 6 \,|\, E_2)$
        $= P(3) + P(5) P(6) - P(3) P(5) P(6)$

Case 5) $P(S_2 | S_1, E_2) = P(5 \cdot 6 \,|\, E_2)$
        $= P(5) P(6)$

Similarly one can analyze Phase 3 subject to the five (5) conditions of success
in Phase 1. The corresponding conditional probabilities are as follows:

Case 1) $P(S_3 | S_1, S_2, E_3) = P(7 \cdot 8 \text{ or } 7 \cdot 2)$
        $= P(7) P(8) + P(7) P(2) - P(2) P(7) P(8)$

Case 2) $P(S_3 | S_1, S_2, E_3) = P(7 \cdot 8) = P(7) P(8)$

Case 3) Same as Case 1. 

Case 4) Same as Case 1. 

Case 5) Same as Case 1. 

Hence the overall mission reliability $P(S)$ can be obtained by summing the products of
the conditional probabilities for the respective cases 1) through 5). Thus

$P(S) = (0.86639)(0.99877)(0.9669)$
$\quad + (0.04560)(0.99877)(0.9312)$
$\quad + (0.05530)(0.96942)(0.9669)$
$\quad + (0.017681)(0.99647)(0.9669)$
$\quad + (0.001129)(0.9118)(0.9669)$
$\quad = 0.949$.

The above approach uses the conditional probability approach of Sec. 7.3. It
can be rather tedious, as it usually would be necessary to list all of the
conditions for each phase, and hence the number of different cases would be the
product of the number of conditions in each phase.
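
The case-by-case computation of Example 7-6 can be sketched in Python as follows. This is only an illustration under the example's assumptions (independent items, the five Phase 1 success conditions listed above, and the Phase 2 and Phase 3 logic stated for each case); the helper names `either`, `phase2`, `phase3`, and `state_prob` are not from the original text.

```python
# Conditional item success probabilities from the table of Example 7-6.
p1 = {1: 0.99, 2: 0.95, 3: 0.94, 4: 0.98}   # Phase 1
p2 = {3: 0.96, 4: 0.99, 5: 0.97, 6: 0.94}   # Phase 2, given S1
p3 = {2: 0.92, 7: 0.97, 8: 0.96}            # Phase 3, given S1 and S2

def either(a, b):
    """P(A or B) for independent events A and B."""
    return a + b - a * b

def phase2(ok3, ok4):
    """P(S2 | S1, E2): success is (3 or 5.6 or 5.4) = 3 or 5.(6 or 4).
    An item that failed in Phase 1 contributes probability zero."""
    a3 = p2[3] if ok3 else 0.0
    a4 = p2[4] if ok4 else 0.0
    return either(a3, p2[5] * either(p2[6], a4))

def phase3(ok2):
    """P(S3 | S1, S2, E3): success is (7.8 or 7.2) = 7.(8 or 2)."""
    a2 = p3[2] if ok2 else 0.0
    return p3[7] * either(p3[8], a2)

def state_prob(ok, prob):
    return prob if ok else 1.0 - prob

# Phase 1 success conditions 1) to 5): item 1 always operates;
# each tuple gives the state of items (2, 3, 4).
cases = [(True, True, True), (False, True, True), (True, False, True),
         (True, True, False), (True, False, False)]

mission = 0.0
for ok2, ok3, ok4 in cases:
    pr1 = (p1[1] * state_prob(ok2, p1[2]) * state_prob(ok3, p1[3])
           * state_prob(ok4, p1[4]))
    mission += pr1 * phase2(ok3, ok4) * phase3(ok2)

print(f"P(S) = {mission:.3f}")
```

The loop makes the growth in effort visible: each additional phase that shares items with earlier phases multiplies the number of conditions to be carried forward.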

Because the above approach can be lengthy and tedious, the mission reliability can
instead be approximated by the method of paths and cuts described in Sec. 7.4. In
this approach the reliability of each component would be taken to be its reliability
up to the end of the last phase in which it is used. If the probability of failure is
assumed to be zero (0) for the phases in which a component is not used, then the
input reliabilities would be equal to the product of the separate conditional
probabilities for each phase.

7.6 N-State Logic Model 

The considerations thus far in Sec. 7 have been based on a two-state model for
each item, one failed state and a non-failed or successful state. In this section
we consider a case in which some of the items may be considered as having two or more
failed states, such as opening, shorting, noise, drift, etc. No additional tools
are needed to solve a problem of this type; however, the analysis does become more
complex. One might need to perform such an analysis in order to make correct
decisions between subsystem configurations. For example, see Ref. 40, which uses a
two-state and a three-state analysis of a particular circuit. As an example consider
a diode-quad with a shorting bar as shown in Fig. 7-18. The circuit fails if two
shorts occur in series (e.g., diodes 1 and 2 or 1 and 4) or if two opens occur in one
end (e.g., diodes 2 and 4 or 1 and 3). Otherwise the system performs successfully.

The probability of a diode performing is denoted by $p$, of opening by $p_o$, and of
shorting by $p_s$. Another technique will be used below to obtain the probability of
success or failure. It is certain that an individual diode will either perform, or
short, or open (assuming no other mode of failure for this analysis). Hence

$p + p_o + p_s = 1$.    (7-23)



Figure 7-18 Diode-Quad with Shorting Bar 

As there are four diodes consider the expansion of Eq. 7-23 to the fourth power.
Thus

$1 = (p + p_o + p_s)^4 = p^4 + 4p^3(p_o + p_s) + 6p^2(p_o + p_s)^2 + 4p(p_o + p_s)^3 + (p_o + p_s)^4$
$\quad = p^4 + 4p^3 p_o + 4p^3 p_s + 6p^2 p_o^2 + 12p^2 p_o p_s + 6p^2 p_s^2 + 4p(p_o^3 + 3p_o^2 p_s + 3p_o p_s^2 + p_s^3)$
$\quad\quad + p_o^4 + 4p_o^3 p_s + 6p_o^2 p_s^2 + 4p_o p_s^3 + p_s^4$.

This expression yields all the various combinations of shorts, opens, and no failures
for the quad configuration given. The coefficients yield the number of ways in
which a certain combination can occur. For example, consider the term

$12p^2 p_o p_s$;

there are 12 ways of obtaining 1 open, 1 short, and 2 operating diodes. That is,
there are 4 ways of selecting the shorted diode, 3 ways of selecting the open diode
from the remaining 3 diodes, and the last two can be selected in only 1 way. Thus
there are 4 x 3 = 12 ways of obtaining this particular combination. If there is only
one short and only one open, a failure cannot occur according to the above statement
of failure. Hence, this term is put into the success probability in the following
formula.

Similarly each term can be treated to determine which portion of the combinations of
opens and shorts contribute to failure or success.

$1 = (p^4 + 4p_o p^3 + 4p^3 p_s + 4p^2 p_o^2 + 12p^2 p_o p_s + 2p^2 p_s^2 + 8p p_o^2 p_s + 4p p_o p_s^2)$
$\quad + (2p^2 p_o^2 + 4p^2 p_s^2 + 4p p_o^3 + 4p p_o^2 p_s + 8p p_o p_s^2 + 4p p_s^3 + p_o^4 + 4p_o^3 p_s + 6p_o^2 p_s^2 + 4p_o p_s^3 + p_s^4)$
$\quad = P(S) + P(F)$ respectively,

where $P(S)$ and $P(F)$ are given in parentheses above. Because $p_o$ and $p_s$ are very small
compared to $p$ the above expressions can be approximated by the following.

$P(S) \approx p^4 + 4p_o p^3 + 4p^3 p_s$,    $P(F) \approx 2p^2 p_o^2 + 4p^2 p_s^2 \approx 2p_o^2 + 4p_s^2$

and the actual probability of success is bounded by

$p^4 + 4p_o p^3 + 4p^3 p_s < P(S) < 1 - (2p_o^2 + 4p_s^2)$.

Example 7-7

Suppose for the diode-quad given in Fig. 7-18 above

$p = 0.99$
$p_o = 0.0080$
$p_s = 0.0020$.

Then

$0.9606 + 0.0388 < P(S) < 1 - (0.000128 + 0.000016)$

$0.9994 < P(S) < 0.99986$.
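
The success polynomial can be cross-checked by brute force. The sketch below, a hedged illustration rather than part of the original analysis, enumerates all $3^4$ diode states under the stated failure rule, assuming that diodes (1, 3) form one end of the quad and diodes (2, 4) the other, and that the diodes fail independently. The function names are hypothetical.

```python
from itertools import product

def quad_success(p, p_open, p_short):
    """P(S) by enumerating the states (G = good, O = open, S = short)
    of diodes 1..4, held at indices 0..3."""
    prob = {"G": p, "O": p_open, "S": p_short}
    ends = [(0, 2), (1, 3)]   # assumed pairing: diodes (1, 3) and (2, 4)
    total = 0.0
    for state in product("GOS", repeat=4):
        # Two opens in one end interrupt the circuit.
        open_fail = any(state[a] == "O" and state[b] == "O" for a, b in ends)
        # A short in each end gives two shorts in series.
        short_fail = all("S" in (state[a], state[b]) for a, b in ends)
        if not (open_fail or short_fail):
            weight = 1.0
            for s in state:
                weight *= prob[s]
            total += weight
    return total

def quad_success_poly(p, po, ps):
    """The P(S) polynomial given in the text."""
    return (p**4 + 4*po*p**3 + 4*p**3*ps + 4*p**2*po**2 + 12*p**2*po*ps
            + 2*p**2*ps**2 + 8*p*po**2*ps + 4*p*po*ps**2)

p, po, ps = 0.99, 0.0080, 0.0020
exact = quad_success(p, po, ps)
lower = p**4 + 4*po*p**3 + 4*p**3*ps
upper = 1.0 - (2*po**2 + 4*ps**2)
print(f"lower={lower:.6f} exact={exact:.8f} upper={upper:.6f}")
```

For the values of Example 7-7 the enumerated $P(S)$ agrees with the polynomial and lies within about $10^{-7}$ of the approximate upper limit, which shows how tight that limit is when $p_o$ and $p_s$ are small.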

It must be emphasized that independence of the events has been assumed
throughout the above analysis. If the diode-quad were exposed to a critical
environment in its mission life or if failure of one diode increased the probability
of failure of another diode, then the probability of success would be altered by the
appropriate conditional probabilities of failure under the given conditions.

The above discussion just touches on the important topic of N-state analysis.
In actual practice an analysis which takes the possible modes of failure
of each component into consideration and which gives the subsystem behavior for each
failure mode would result in an extremely large number of cases to examine. This can
be true even for a two-state analysis. Hence one must cope with the dimensionality
problem by first identifying the more likely weaknesses of the equipment and then
performing a detailed analysis on these components, such as the analysis of a
particular redundant configuration given above for the diode-quad. The logical
operations and the probability analysis for the N-state situation are more complex
than those for a two-state analysis but the same basic techniques are applicable.

8. Models Considering Time

In this section the explicit use of time is considered for multi-item problems.
A straightforward approach is to develop a logic model as in Sec. 7, where item
success and failure are expressed probabilistically as attributes, and
then to substitute for each attribute the appropriate time measure as described in
Sec. 4.2. This will be discussed first in Sec. 8.1. For some problems the
substitution approach is not applicable, and a more involved convolution approach is
discussed in Sec. 8.2 for these problems. The approaches presented in Secs. 8.1 and
8.2 can be used for the time to first failure of a system where the individual items
can have many possible time to failure distributions such as gamma or log-normal.
However, most often it is assumed that all the items have exponential failure
distributions. Where this assumption is made, the system reliability prediction models
of Secs. 8.1 and 8.2 are applicable regardless of how much operating time has been
accumulated, provided it is known that all items in the system are non-failed. Further,
if the exponential failure distribution is assumed for all items, then the methods of
continuous Markov processes and difference equations can be used to develop reliability
models without first developing a logic model. This approach is acknowledged in
Sec. 8.3, along with other approaches which are somewhat specialized. The final
Sec. 8.4 contains the development of a general redundancy equation which is suitable
for general reliability prediction and which also may be used for reliability
allocation decisions.

8.1 Logic Model Substitution 

The logic form of reliability prediction models can be readily extended to
explicitly consider time. This is done by simply substituting the applicable
probabilities of success or failure as functions of time, R(t) or F(t), for each
item, into the multi-item logic model. Such a substitution is possible where the
R(t) or F(t) for each item is applicable for the time t of interest, which means that
reliability prediction models for certain systems such as the classical standby
redundancy and the rope models cannot be developed by this method. The following
sections will treat these and considerations other than logic based model substitutions.
In this section several of the logic based models from Sec. 7 will be extended via
examples to consider time explicitly. Those not treated can be readily developed
and are shown in most reliability books and handbooks. Exponential failure time
distributions for items will be used because of their conventional emphasis. Other
distributions can be readily substituted for first-failure time models, but as they
lead to complications if later failures are explicitly considered they are not so
widely used.

Series System. If all the items of a system must operate in order for the
system to perform its intended function, then the items are said to be in a series
system. In Sec. 7.2.1 it was stated that the probability that n items $A_1, A_2, \ldots, A_n$
operate, assuming independence, is given by

$P(S) = \prod_{i=1}^{n} P(A_i)$.

If item $A_i$ has a mean time between failures (MTBF) of $\theta_i$ or a failure rate
$\lambda_i$ ($= 1/\theta_i$) and if the mission time is $T_M$, then

$R(A_i) = P(\text{component } A_i \text{ survives time } T_M) = e^{-\lambda_i T_M} = e^{-T_M/\theta_i}$

provided $\lambda_i$ is constant throughout the entire mission. If $\lambda_i$ changes with the
mission phases one must perform the computations for each phase separately for
non-serial systems. Some simplification can be made in this procedure. The mission
success probability is given by

$P(S) = \prod_{i=1}^{n} e^{-\lambda_i T_M} = e^{-T_M \sum_{i=1}^{n} \lambda_i}$,    (8-1)

that is, the failure rates can be added for the n components in series to obtain an
overall system failure rate. If some failure time distribution other than the
exponential is appropriate the $R(A_i)$ can be expressed as the appropriate integral
of the density function. These integrals are tabulated for almost all density
functions of interest in many standard statistics texts.

Parallel Configuration. If n items are in parallel then system success is
equivalent to at least one item operating. Another way of stating system success is
that the items do not all fail. Using this logical form the following result is
obtained

$P(S) = 1 - \prod_{i=1}^{n} P(\bar{A}_i) = 1 - \prod_{i=1}^{n} (1 - e^{-\lambda_i T_M})$.    (8-2)

For small values of $\lambda_i T_M$ (all i) the following approximation can be used to simplify
the above calculations.

$1 - e^{-\lambda_i T_M} = 1 - \left(1 - \lambda_i T_M + \frac{\lambda_i^2 T_M^2}{2!} - \cdots\right) \approx \lambda_i T_M$.

Using this approximation

$P(S) \approx 1 - \prod_{i=1}^{n} \lambda_i T_M$ for $\lambda_i T_M$ very small.    (8-3)
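
As a rough illustration of the substitution approach, the following Python sketch evaluates the series and parallel formulas for items with constant failure rates, together with the small-$\lambda_i T_M$ approximation for the parallel case. The function names and the sample failure rates are assumptions made for the example only.

```python
import math

def series_reliability(rates, t):
    """Series system: the failure rates add (Eq. 8-1)."""
    return math.exp(-sum(rates) * t)

def parallel_reliability(rates, t):
    """Active-parallel system: fails only if every item fails (Eq. 8-2)."""
    return 1.0 - math.prod(1.0 - math.exp(-lam * t) for lam in rates)

def parallel_approx(rates, t):
    """First-order approximation for small lam * t (Eq. 8-3)."""
    return 1.0 - math.prod(lam * t for lam in rates)

# Hypothetical failure rates (per hour) and mission time (hours).
rates = [1e-4, 2e-4]
t_m = 10.0
print(series_reliability(rates, t_m))
print(parallel_reliability(rates, t_m), parallel_approx(rates, t_m))
```

The approximation slightly overstates the unreliability, since $1 - e^{-x} < x$ for $x > 0$, so it errs on the conservative side.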

8.2 Standby and Rope Models 

Reliability prediction models for some systems cannot be developed by
substitution in logic models. Such systems are those where not all of
the items are used throughout the time interval of interest (standby redundancy)
and those where the probability of success for some items changes at uncertain times
in the time interval of interest (rope redundancy). For these situations prediction
models can be developed using the convolution concept.

8.2.1 Standby Redundancy 
Case 1- Perfect Switch. 

Suppose that a system consists of m items, m-1 of which are on standby for use when
the item in use fails, as indicated in Fig. 8-1.



Figure 8-1 Standby System With m-1 Items on Standby

It is assumed for the present that the switchover devices are 100 percent reliable
(i.e. that the failure rate is zero in the standby operation), and that each item
has an exponential failure time distribution with failure rate $\lambda_i$. Let $T_M$ be the
mission time. Now if the system is to perform its function for the time $T_M$, the total
of the operating times must exceed $T_M$. If $t_1, t_2, \ldots, t_m$ are the times to failure
of each of the respective components, then the probability of successful operation of
the system $P(S)$ is equivalent to the probability that the cumulated failure times
of the m components exceed $T_M$, or

$P(S) = P(t_1 + t_2 + \cdots + t_m \ge T_M)$.

Consider this problem for the case m = 2, in which it is necessary to obtain the
probability that $t_1 + t_2 \ge T_M$. Now

$p(t_1) = \lambda_1 e^{-\lambda_1 t_1}$,  $0 \le t_1 < \infty$
$p(t_2) = \lambda_2 e^{-\lambda_2 t_2}$,  $0 \le t_2 < \infty$

and the probability that $t_1 + t_2 \ge T_M$ is given by the double integral

$P(S) = 1 - P(F) = 1 - P(t_1 + t_2 < T_M) = 1 - \lambda_1 \lambda_2 \int_0^{T_M} \int_0^{T_M - t_1} e^{-\lambda_1 t_1} e^{-\lambda_2 t_2} \, dt_2 \, dt_1$    (8-4)

where the region of integration is shown in Fig. 8-2.



Figure 8-2 Region of Integration

Integration of Eq. 8-4 yields

$P(S) = e^{-\lambda_1 T_M} + \frac{\lambda_1}{\lambda_2 - \lambda_1} \left[e^{-\lambda_1 T_M} - e^{-\lambda_2 T_M}\right]$,  if $\lambda_2 \ne \lambda_1$,

$P(S) = e^{-\lambda_1 T_M} [1 + \lambda_1 T_M]$,  if $\lambda_2 = \lambda_1$.    (8-5)

The above formula can be interpreted as the probability that item 1 survives the
entire mission time plus the probability that item 1 fails at time $t_1$ but that
$t_2 \ge T_M - t_1$ (the contribution of the second term).

In case the items are all identical and perfect switching exists then the
probability that a system of m components (m-1 standby components) survives is
given by

$P(S) = P(t_1 + t_2 + \cdots + t_m \ge T_M) = e^{-\lambda T_M} \left[1 + \lambda T_M + \frac{(\lambda T_M)^2}{2!} + \cdots + \frac{(\lambda T_M)^{m-1}}{(m-1)!}\right]$.    (8-6)

Note that this formula gives the probability of 0, 1, 2, ..., m-1 failures for a
variable having the Poisson distribution with mean number of failures given by $\lambda T_M$.

Case 2- Imperfect Switch. If imperfect switching were taken into consideration the
second term of Eq. 8-5 would have to be multiplied by the probability that the
switch-over occurs, P(sw) say, and hence

$P(S) = e^{-\lambda_1 T_M} + P(sw) \frac{\lambda_1}{\lambda_2 - \lambda_1} \left[e^{-\lambda_1 T_M} - e^{-\lambda_2 T_M}\right]$.    (8-7)

See Ref. 41 for a statement of the above result when several standby components
are allowed. Also see Sec. 8.4 for a more general formula for combinations of
redundancy.
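
The standby formulas are simple to evaluate directly. The sketch below is a minimal Python illustration with hypothetical function names; for equal failure rates it uses the limiting form $e^{-\lambda T_M}[1 + P(sw)\,\lambda T_M]$, which follows from letting $\lambda_2 \to \lambda_1$ in the two-item formula and is stated here as an assumption of the example rather than a result quoted from the text.

```python
import math

def standby_two(lam1, lam2, t, p_switch=1.0):
    """Two-item standby system; p_switch = 1 gives the perfect-switch case."""
    first = math.exp(-lam1 * t)                 # item 1 survives the mission
    if math.isclose(lam1, lam2):
        second = lam1 * t * math.exp(-lam1 * t)  # limit as lam2 -> lam1
    else:
        second = lam1 / (lam2 - lam1) * (math.exp(-lam1 * t) - math.exp(-lam2 * t))
    return first + p_switch * second

def standby_identical(lam, t, m):
    """m identical items, m - 1 on standby, perfect switching:
    Poisson probability of at most m - 1 failures in time t."""
    x = lam * t
    return math.exp(-x) * sum(x ** k / math.factorial(k) for k in range(m))

print(standby_two(0.01, 0.01, 100.0))
print(standby_identical(0.01, 100.0, 2))
print(standby_two(0.01, 0.02, 100.0, p_switch=0.95))
```

Setting `p_switch` to zero reduces the result to the reliability of the first item alone, which is a convenient sanity check on any implementation.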

8.2.2 Rope Model 

In some physical situations system failure does not occur until all or k out
of n items fail (for example, as strands in a rope), but the failure of some of the
items increases the stresses on the remaining items and thereby decreases their
reliability.

Case 1: Suppose that the load on a system is constant and that initially n items are
sharing the load. As the elements fail the load is equally shared by the
remaining elements. Thus if the original stress per element is $S_0/n$, then the
subsequent stresses increase to $S_0/(n-1)$ for one failure, $S_0/(n-2)$ for two failures, etc.
The increase in stress on each item will usually result in a corresponding increase
in the failure rate for the items as the ratio of the operating stress to the rated
stress increases. Let the stress ratio be h as given by

$h = \frac{\text{operating stress}}{\text{rated stress}} = \frac{S_0/(n-f)}{S_r} = \frac{S_0}{s S_r}$

where f is the number of failures and s (= n-f) is the number of survivors. If the
rated stress is exceeded by $S_0/s$ then the system is assumed to fail. Let the maximum
number of failures be n-k, or k be the minimum number of items for
operation. Thus for non-failed operation the stress ratio must be less than unity,
i.e.

$h = \frac{S_0/k}{S_r} < 1$,  or  $s = n - f \ge k$.

Now suppose that the failure rate for an individual item at time t for stress ratio
h is denoted by $\lambda(t; h)$.

In this first case assume that

$\lambda(t; h) = \lambda_0 h$,

that is, $\lambda$ increases linearly with h; $\lambda_0$ is a constant. The failure rate for the
system $\lambda_S$ is given by

$\lambda_S = s \lambda_0 h$

where s is the number of non-failed or successful items. Now

$h = \frac{S_0}{s S_r}$

and thus

$\lambda_S = s \lambda_0 \frac{S_0}{s S_r} = \frac{S_0 \lambda_0}{S_r}$

which is constant. Ref. 42 treats this case and Ref. 43 has included it as a
special case of finding the reliability of a parallel redundant system when the item
failure rate is $\lambda = \lambda(h)$, a general function of the stress ratio.

Thus in this case of constant system failure rate the time to failure of the
system is given by

$T_S = t_1 + t_2 + \cdots + t_f$

where f is the number of failures. Now if each $t_i$ is assumed to have the exponential
failure time density function, i.e.

$p(t_i) = \lambda_S \exp(-\lambda_S t_i)$,

the distribution of $T_S$ is given by the f-fold convolution of $p(t_i)$,

$p(T_S) = p(t_1) * p(t_2) * \cdots * p(t_f)$.

For n = 2 items,

$p(T_S) = p(t_1) * p(t_2) = \int_0^{T_S} p(t_1) \, p(T_S - t_1) \, dt_1 = \lambda_S^2 T_S e^{-\lambda_S T_S}$.

Similarly for the f-fold convolution one obtains

$p(T_S) = [p(t)]^{*f} = \frac{\lambda_S^f \, T_S^{f-1} \exp(-\lambda_S T_S)}{\Gamma(f)}$,  $T_S > 0$.    (8-8)

This is the gamma density function with shape parameter f and the same scale parameter
$\lambda_S$ as for the exponential distribution. See Ref. 44 for further details in the
derivation of the distribution. Setting f = n-k+1 yields

$p(T_S) = \frac{\lambda_S^{n-k+1} \, T_S^{n-k} \exp\{-\lambda_S T_S\}}{\Gamma(n-k+1)}$

where $T_S$ is the time at which the (n-k+1)st failure occurs.

Case 2: Suppose that the failure rate of an individual item is of the general form

$\lambda = \lambda(h)$

of the stress ratio h, where $\lambda(h)$ is not necessarily linear, as indicated in Fig. 8-3.
The result is given in Ref. 42 in the form of a complex integral with values of the
residues to be determined.


8.3 Additional Approaches 

Several other approaches which are used for deriving reliability models that
explicitly consider time are briefly identified.

8.3.1 Continuous Markov Process 

Another method of deriving conventional reliability models when all items in
the system have an exponential distribution is to use the approach of a first order
Markov process and difference equations. A text [Ref. 3] is devoted mainly to
the derivation of models based on this approach. A state-space diagram relates the
possible transitions between the possible system states. The postulate is applied:
the probability of a state change during (t, t+dt) is $\lambda\,dt$ plus terms of smaller
order than dt, and the probability that more than one change occurs is of smaller
order than dt. This approach leads to a set of linear homogeneous differential
equations, which can be solved for the probability of success as a continuous function
of time. It is thus the approach used in Sec. 4.5 for the development of the Poisson
process.

Different system configurations (e.g. series, active-parallel, and standby-parallel)
lead to different success probability functions, which are identical to those
obtained from the approach in the preceding Secs. 8.1 and 8.2.
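
As a small illustration of the Markov formulation, consider two identical items in standby redundancy with perfect switching. With $P_0(t)$ the probability that the original item is operating and $P_1(t)$ the probability that the standby item is operating, the postulate gives $dP_0/dt = -\lambda P_0$ and $dP_1/dt = \lambda P_0 - \lambda P_1$. The Python sketch below integrates these equations with a fourth-order Runge-Kutta scheme; the state labels, the integration method, and the function name are choices made for this example and are not taken from Ref. 3.

```python
import math

def standby_markov(lam, t, steps=1000):
    """Reliability P0(t) + P1(t) of a two-item standby system, obtained by
    numerically integrating the state equations with classical RK4."""
    def deriv(p0, p1):
        return -lam * p0, lam * p0 - lam * p1

    h = t / steps
    p0, p1 = 1.0, 0.0          # start with the original item operating
    for _ in range(steps):
        k1 = deriv(p0, p1)
        k2 = deriv(p0 + 0.5 * h * k1[0], p1 + 0.5 * h * k1[1])
        k3 = deriv(p0 + 0.5 * h * k2[0], p1 + 0.5 * h * k2[1])
        k4 = deriv(p0 + h * k3[0], p1 + h * k3[1])
        p0 += h / 6.0 * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0])
        p1 += h / 6.0 * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1])
    return p0 + p1

lam, t = 0.01, 100.0
print(standby_markov(lam, t), math.exp(-lam * t) * (1.0 + lam * t))
```

The numerical result agrees with the closed form $e^{-\lambda t}(1 + \lambda t)$ of Sec. 8.2.1, as the text asserts for the standby-parallel configuration.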





The Markov process approach can be readily extended to include maintenance, 
which is really the advantage of this type of model formulation. Here the state- 
space transition diagram is expanded from only failure transitions to include both 
failure and repair transitions. The same postulate can be applied to repair as was 
applied to failure, resulting in an expanded set of differential equations. These 
can be solved for availability formulas. This Markov process formulation is thus 
best suited for system level modeling where both maintainability and reliability 
are to be explicitly considered, but where the operational profile and the system 
are not so complex that an analytical approach becomes unwieldy. 

8.3.2 Extreme Value Theory 

An approach for obtaining certain prediction equations can be based on concepts
of order statistics when the lifetimes of all items are independent and identically
distributed. Here the probability density function is derived for the particular
item which, when it fails, will fail the system. For each of the following systems
this item is:

(1) Series: Shortest lifetime pdf from n series items. 

(2) Parallel: Largest lifetime pdf from l parallel items. 

(3) Series Strings in Parallel: Largest lifetime pdf from l 

items from the shortest lifetime from n items. 

(4) Parallel in Series String: Smallest lifetime pdf from 

n items from the largest lifetime from l items. 

Since in most practical problems the items do not all have identical pdf's, the
general applicability of this approach is restricted.


Example 8-1 

If a system consists of n items in series, e.g., linked together
in the form of a chain, the lifetime of the chain cannot be more than
that of the weakest link. The life length distribution of the chain
would be that of the shortest life length. Ref. 43 treats this
problem. The probability that the shortest life is less than t is
given by

$F_1(t) = 1 - \text{Prob}\{\text{lives of all n items are greater than or equal to t}\}$
$\quad\quad = 1 - [1 - F(t)]^n = 1 - R^n(t)$

and the corresponding density function by

$p_1(t) = n [1 - F(t)]^{n-1} p(t)$,

where $R(t) = 1 - F(t)$ is the reliability of a single item and $p(t)$ is the
lifetime density function of a single item.
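Suppose that the distribution function for the lifetime of each member of the
chain is Weibull, i.e.

$F(t) = 1 - \exp\left\{-\left(\frac{t - \gamma}{\eta}\right)^{\beta}\right\}$,  for $t \ge \gamma$,

then

$F_1(t) = 1 - \exp\left\{-n \left(\frac{t - \gamma}{\eta}\right)^{\beta}\right\}$

and the density function is

$p_1(t) = \frac{n \beta}{\eta} \left(\frac{t - \gamma}{\eta}\right)^{\beta - 1} \exp\left\{-n \left(\frac{t - \gamma}{\eta}\right)^{\beta}\right\}$.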
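
The weakest-link result can be illustrated numerically. The Python sketch below assumes a three-parameter Weibull lifetime with location `gamma`, scale `eta`, and shape `beta`; this parameterization and the function names are assumptions made for the example, and the sample values are hypothetical.

```python
import math

def weibull_cdf(t, gamma, eta, beta):
    """Assumed single-link lifetime distribution F(t)."""
    if t < gamma:
        return 0.0
    return 1.0 - math.exp(-(((t - gamma) / eta) ** beta))

def chain_cdf(t, n, gamma, eta, beta):
    """Distribution of the shortest of n lifetimes: 1 - [1 - F(t)]**n."""
    if t < gamma:
        return 0.0
    return 1.0 - math.exp(-n * ((t - gamma) / eta) ** beta)

def chain_pdf(t, n, gamma, eta, beta):
    """Density of the shortest lifetime: n [1 - F(t)]**(n-1) p(t)."""
    if t <= gamma:
        return 0.0
    z = (t - gamma) / eta
    return n * beta / eta * z ** (beta - 1.0) * math.exp(-n * z ** beta)

# Hypothetical chain of 5 links.
t, n, gamma, eta, beta = 150.0, 5, 50.0, 200.0, 1.5
single = weibull_cdf(t, gamma, eta, beta)
print(single, chain_cdf(t, n, gamma, eta, beta), 1.0 - (1.0 - single) ** n)
```

Under this parameterization the chain lifetime is again Weibull with the same shape and location, and with the scale reduced by the factor $n^{-1/\beta}$.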


8.3.3 Flowgraphs

Flowgraphs are a graphical method of representing a set of simultaneous
equations and have been applied to electronic and other engineering problems.
They augment a classical mathematical approach. There has been some exploratory
application of flowgraph techniques to the development of reliability prediction
models [Refs. 45 and 46] but this approach is not widely used. An advantage of
a flowgraph approach is that, for one already skilled in its use for
engineering problems, it may be a ready method for learning about the development
of reliability equations.

8.4 General Redundancy Model 

Three of the redundancy models which have been introduced are those for: 

(1) all items functioning, i.e. Eq. 8-2 which will be referred 
to as items in parallel, 

(2) standby redundancy where there is a "perfect" switch, i.e. 

Eq. 8-6, which will be referred to as spares, and 

(3) standby redundancy where there is a switch, i.e. Eq. 8-7, 
which will continue to be referred to as standby redundancy. 

Interest is in a general reliability model for parallel arrangements of identical
items using any of these three redundancy approaches, where the requirement can be
that one or more items must work. In addition to reliability prediction this model
can also be used for the general allocation problem concerning optimum selection of
a redundant configuration. This model is an input for a general reliability cost
tradeoff program (RECTA) which is covered in Volume II - Computation.

In this section the following notation is used:

$n$          identical items in parallel,
$m$          identical spares,
$r$          identical items in standby redundancy,
$n_0$        number of items that must work,
$p$          switch reliability,
$s$          the number of switches which work,
$t$          the mission time, and
$\lambda$    the failure rate.

The general formulas for the cases in which (a) $n_0 = 1$ and (b) $n_0 > 1$ are
derived separately. Although case (a) is a special case of case (b), it is useful
to derive the simpler case first for a better understanding of the more general
formula.

8.4.1 Reliability of a System for $n_0 = 1$

In this section the general formula is derived for the situation that only
one item must work.

The probability that s switches work is given by the binomial formula

$\binom{r}{s} p^s (1-p)^{r-s}$.

If s switches work then the m spares plus the s items in standby result in m+s items
on "standby" (manual or automatic). Thus the reliability is given by the
formula

$R_S = \sum_{s=0}^{r} \left[ \binom{r}{s} p^s (1-p)^{r-s} R_1(n, m+s; t) \right]$,

where $R_1(n, m+s; t)$ is the reliability for a mission of length t given that s
switches work, n active items and m spares are available, and hence m+s standby
items. This reliability is given by three cases:

Case 1: m+s = 0. In this case the reliability is given simply by the probability
that at least one of the n active items survives time t, that is,

$R_1 = 1 - [1 - e^{-\lambda t}]^n$.



Case 2: m+s = 1. In this case the reliability is the probability that the n
active items plus the one (1) standby item survive time t, or

$R_1 = 1 - \sum_{j=0}^{n} \binom{n}{j} (-1)^j e^{-j\lambda t} A(j)$,

where

$A(j) = \frac{1}{(m+s-1)!} \int_0^t e^{-\lambda t_2 (1-j)} (\lambda t_2)^{m+s-1} \, d(\lambda t_2)$,  general formula,

$\quad\quad = \int_0^t e^{-\lambda t_2 (1-j)} \, d(\lambda t_2)$  for m+s = 1,

and for specific values of j we obtain

$A(0) = 1 - e^{-\lambda t}$,

$A(1) = \lambda t$, and

$A(j) = (e^{(j-1)\lambda t} - 1)/(j-1)$,  j = 2, ..., n.

Case 3: m+s = 2, 3, ..., i.e., m+s is a positive integer larger than 1.

If m+s is larger than one (m+s > 1) the formula for $R_1$ is the same as the above with
the exception that the A(j) are given by the following formulas, which include Case 2
as a special case.

$A(0) = 1 - \frac{e^{-\lambda t}}{(m+s-1)!} \left[ (\lambda t)^{m+s-1} + (m+s-1)(\lambda t)^{m+s-2} + \cdots + (m+s-1)! \right]$,

$A(1) = (\lambda t)^{m+s}/(m+s)!$, and

$A(j) = \frac{e^{(j-1)\lambda t}}{(m+s-1)! \, (j-1)^{m+s}} \left\{ [(j-1)\lambda t]^{m+s-1} - (m+s-1)[(j-1)\lambda t]^{m+s-2} + \cdots + (-1)^{m+s-1} (m+s-1)! \right\} + \frac{(-1)^{m+s}}{(j-1)^{m+s}}$,  j = 2, 3, ..., n.
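
The three cases can be combined into a short numerical sketch. The Python code below evaluates A(j) from its integral definition by Simpson's rule rather than from the closed forms, and then sums over the number of working switches. The function names, the use of Simpson's rule, and the step count are choices made for this illustration only; the sketch is not the RECTA program.

```python
import math

def a_coeff(j, k, lam_t, steps=2000):
    """A(j) = 1/(k-1)! * integral over x in [0, lam*t] of
    exp(-(1-j) x) x**(k-1) dx, with k = m + s >= 1 (Simpson's rule)."""
    def f(x):
        return math.exp(-(1 - j) * x) * x ** (k - 1)
    h = lam_t / steps
    total = f(0.0) + f(lam_t)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * f(i * h)
    return total * h / 3.0 / math.factorial(k - 1)

def r_given_standby(n, k, lam, t):
    """Reliability with n active items and k = m + s usable standby items,
    one item required to work."""
    if k == 0:
        return 1.0 - (1.0 - math.exp(-lam * t)) ** n
    q = sum(math.comb(n, j) * (-1) ** j * math.exp(-j * lam * t)
            * a_coeff(j, k, lam * t) for j in range(n + 1))
    return 1.0 - q

def system_reliability(n, m, r, p, lam, t):
    """Weight the conditional reliability by the binomial probability
    that s of the r switches work."""
    return sum(math.comb(r, s) * p ** s * (1.0 - p) ** (r - s)
               * r_given_standby(n, m + s, lam, t) for s in range(r + 1))

# Hypothetical configuration: 2 active items, 1 spare, 1 switched standby item.
print(system_reliability(n=2, m=1, r=1, p=0.95, lam=0.01, t=50.0))
```

For a single active item the result reduces to the spares formula of Sec. 8.2.1, which provides a convenient check on the implementation.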
Derivation. The derivation of the expression for $R_\ell$ is given in the following
discussion. First consider the probability that a system of n items in parallel
(all active) will survive time t. Let $t_{(n)}$ be the longest time of survival for
the n items and hence what is required is the probability that $t_{(n)} > t$. The
probability that all n items fail in the interval (0, t) is given by

$$F_1(t) = P\{t_{(n)} \leq t\} = F^{n}(t) \qquad (8\text{-}9)$$

and the probability that at least one item survives time t is given by $1 - F_1(t)$,

$$P\{t_{(n)} > t\} = 1 - [1 - e^{-\lambda t}]^{n}. \qquad (8\text{-}10)$$

The probability density function for $t_{(n)}$ is given by differentiating $F_1(t)$ to yield

$$p_1(t_1) = n[1 - e^{-\lambda t_1}]^{n-1} \lambda e^{-\lambda t_1},$$

where $t_1$ is substituted for $t_{(n)}$ for convenience.

It is now desired to find the time to failure distribution for the m+s
"spares" in order to find the total survival time for active and spare items. It
is assumed that the n parallel active items have all failed at time $t_1$ and then the
m+s spares will be used one-at-a-time until all have failed. Thus we want the
probability density of the time to failure of these m+s spares with the assumption
that one of them is used immediately, at time zero for the spares. The survival
time is the sum of m+s times each of which has an exponential failure time density
function. Hence the frequency function for the sum ($t_2$) is the Gamma distribution

$$p_2(t_2) = \lambda e^{-\lambda t_2} (\lambda t_2)^{m+s-1}/(m+s-1)!$$

where $t_2$ is used to denote the survival time of m+s "spares", automatic and/or manual.
The reliability is given by the probability that the sum of the two survival times
as described above, $t_1 + t_2$, is larger than or equal to t, i.e.,

$$P(t_1 + t_2 \geq t).$$

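The probability that the sum is less than t is given by the convolution integral

$$P(t_1 + t_2 < t) = \int_{t_2=0}^{t} p_2(t_2)\, F_1(t - t_2)\, dt_2,$$

where $F_1(t - t_2)$ is obtained by substitution in Eq. 8-9 above. Thus

$$P(t_1 + t_2 < t) = \int_{0}^{t} \left[ \lambda e^{-\lambda t_2} \frac{(\lambda t_2)^{m+s-1}}{(m+s-1)!} \right] \left[ 1 - e^{-\lambda (t - t_2)} \right]^{n} dt_2$$

$$= \int_{0}^{t} \left[ e^{-\lambda t_2} \frac{(\lambda t_2)^{m+s-1}}{(m+s-1)!} \right] \left[ \sum_{j=0}^{n} \binom{n}{j} (-1)^{j} e^{-j\lambda (t - t_2)} \right] d(\lambda t_2)$$

$$= \sum_{j=0}^{n} \binom{n}{j} (-1)^{j} e^{-j\lambda t} \frac{1}{(m+s-1)!} \int_{0}^{t} e^{-\lambda t_2} e^{j\lambda t_2} (\lambda t_2)^{m+s-1}\, d(\lambda t_2).$$

Hence, for $m+s \geq 1$,

$$1 - R_\ell(n, m+s, t) = P(t_1 + t_2 \leq t) = \sum_{j=0}^{n} \binom{n}{j} (-1)^{j} e^{-j\lambda t} A(j), \qquad (8\text{-}11)$$

where

$$A(j) = \frac{1}{(m+s-1)!} \int_{0}^{t} e^{-\lambda t_2 (1-j)} (\lambda t_2)^{m+s-1}\, d(\lambda t_2).$$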

If m+s = 0. There is no need for A(j), j = 0, 1, ..., n, and we use Eq. 8-10
for n items in active parallel.

If m+s = 1. For $j \neq 1$ but an integer greater than or equal to zero

$$A(j) = \int_{0}^{t} e^{-\lambda t_2 (1-j)}\, d(\lambda t_2) = \frac{1}{(1-j)} \left[ 1 - e^{-\lambda t (1-j)} \right],$$

and for j = 1

$$A(1) = \int_{0}^{t} d(\lambda t_2) = \lambda t.$$


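If m+s > 1.

$$A(0) = \frac{1}{(m+s-1)!} \int_{0}^{t} e^{-\lambda t_2} (\lambda t_2)^{m+s-1}\, d(\lambda t_2)$$

$$= 1 - e^{-\lambda t} \left[ (\lambda t)^{m+s-1} + (m+s-1)(\lambda t)^{m+s-2} + \ldots + (m+s-1)! \right] / (m+s-1)!$$

$$A(1) = \frac{1}{(m+s-1)!} \int_{0}^{t} (\lambda t_2)^{m+s-1}\, d(\lambda t_2) = \frac{(\lambda t)^{m+s}}{(m+s)!}$$

$$A(2) = \frac{1}{(m+s-1)!} \int_{0}^{t} e^{\lambda t_2} (\lambda t_2)^{m+s-1}\, d(\lambda t_2)$$

$$= \frac{1}{(m+s-1)!}\, e^{\lambda t} \left[ (\lambda t)^{m+s-1} - (m+s-1)(\lambda t)^{m+s-2} + \ldots + (-1)^{m+s-1}(m+s-1)! \right] + (-1)^{m+s}$$

or in general for $j \geq 2$

$$A(j) = \frac{e^{(j-1)\lambda t}}{(m+s-1)!\,(j-1)^{m+s}} \left\{ [(j-1)\lambda t]^{m+s-1} - (m+s-1)[(j-1)\lambda t]^{m+s-2} + \ldots + (-1)^{m+s-1}(m+s-1)! \right\} + \frac{(-1)^{m+s}}{(j-1)^{m+s}}.$$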


Having obtained all A(j), for j = 0, 1, ..., n the results are substituted into Eq. 8-11
to obtain

$$P(t_1 + t_2 \leq t),$$

and then the desired probability is the reliability $R_\ell$, that is

$$R_\ell(n, m+s, t) = 1 - P(t_1 + t_2 < t).$$

This result must be obtained for each possible s and used in the formula for the
reliability of an item,

$$R_e = \sum_{s=0}^{r} \left[ \binom{r}{s} p^{s} (1-p)^{r-s} R_\ell(n, m+s, t) \right]. \qquad (8\text{-}12)$$
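The convolution integral behind Eq. 8-11 can also be evaluated numerically as an independent check on the closed-form coefficients. The sketch below is illustrative only: it applies composite Simpson quadrature to the product of the Gamma density of the spares and the distribution function of the active group, and compares the result with the closed form for the special case n = 2, m+s = 1. The function names and the step count are assumptions made for the illustration.

```python
import math

def p2(t2, k, lam):
    """Gamma density of the total life of k standby items used one at a time."""
    return lam * math.exp(-lam * t2) * (lam * t2)**(k - 1) / math.factorial(k - 1)

def F1(t, n, lam):
    """Probability that all n active items have failed by time t (Eq. 8-9)."""
    return (1.0 - math.exp(-lam * t))**n

def unreliability(n, k, lam, t, steps=2000):
    """Composite Simpson estimate of the convolution over t2 in (0, t); steps must be even."""
    h = t / steps
    total = 0.0
    for i in range(steps + 1):
        t2 = i * h
        w = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += w * p2(t2, k, lam) * F1(t - t2, n, lam)
    return total * h / 3.0

q = unreliability(n=2, k=1, lam=1.0, t=1.0)
# Closed form from Eq. 8-11 for n = 2, m+s = 1, lambda*t = 1
closed = 1.0 - 2.0 * math.exp(-1.0) - math.exp(-2.0)
print(q, closed)
```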
8.4.2 Reliability of a System for $n_0 \geq 1$

Suppose that $n_0$ items must operate in order for a system to properly perform
its function. In the previous derivation $n_0 = 1$ and the distribution of life with
n items in active parallel was given by the maximum life for the n items. In this
case the time to failure is given as the time to failure of the $(n - n_0 + 1)$th item.
The probability that the $(n - n_0 + 1)$th item fails in the interval (t, t+dt) is given
by

$$\frac{n!}{(n_0-1)!\,(n-n_0)!} [F(t)]^{n-n_0} [1-F(t)]^{n_0-1} p(t)\,dt,$$

where the probability density function and distribution function for a single item
are

$$p(t) = \lambda e^{-\lambda t},$$

$$F(t) = 1 - e^{-\lambda t}.$$

Thus

$$p_3(t) = C(n, n_0)\,\lambda e^{-\lambda t (n_0 - 1 + 1)} [1 - e^{-\lambda t}]^{n-n_0}$$

$$= C(n, n_0)\,\lambda e^{-\lambda t n_0} \sum_{k=0}^{n-n_0} \binom{n-n_0}{k} (-1)^{k} e^{-k\lambda t}, \qquad (8\text{-}13)$$

where

$$C(n, n_0) = n!/[(n_0-1)!\,(n-n_0)!].$$

The distribution function of the time to the $(n - n_0 + 1)$th failure can be obtained by
integration of $p_3(t)$,

$$F_3(t_3) = C(n, n_0) \sum_{k=0}^{n-n_0} \binom{n-n_0}{k} (-1)^{k} \int_{0}^{t_3} e^{-\lambda t (k+n_0)}\, \lambda\, dt$$

$$= C(n, n_0) \sum_{k=0}^{n-n_0} \binom{n-n_0}{k} (-1)^{k+1} \frac{1}{(k+n_0)} \left[ e^{-\lambda t_3 (k+n_0)} - 1 \right],$$

where $t_3$ is used to denote the life-length of the n active items. Hence

$$F_3(t_3) = C(n, n_0) \sum_{k=0}^{n-n_0} \binom{n-n_0}{k} \frac{(-1)^{k+1}}{(k+n_0)}\, e^{-\lambda t_3 (k+n_0)} + B(n, n_0),$$

where

$$B(n, n_0) = C(n, n_0) \sum_{k=0}^{n-n_0} \binom{n-n_0}{k} (-1)^{k} \frac{1}{(k+n_0)}.$$


It is now possible to derive the distribution of $t_2 + t_3$ where $t_2$ is the time
to failure of the m+s "spares" and $t_3$ is the time of failure of the n active items
in parallel of which $n_0$ must survive. Hence, by convolution of these two distributions
the distribution of $t = t_2 + t_3$ is given by the integral

$$P\{t_2 + t_3 < t\} = \int_{0}^{t} p_2(t_2)\, F_3(t - t_2)\, dt_2$$

$$= \int_{0}^{t} \left[ \lambda e^{-\lambda t_2} \frac{(\lambda t_2)^{m+s-1}}{(m+s-1)!} \right] \left[ C(n, n_0) \sum_{k=0}^{n-n_0} \binom{n-n_0}{k} (-1)^{k+1} \frac{1}{(k+n_0)}\, e^{-\lambda (t - t_2)(k+n_0)} + B(n, n_0) \right] dt_2,$$

so that

$$P\{t_2 + t_3 \leq t\} = C(n, n_0) \sum_{k=0}^{n-n_0} \binom{n-n_0}{k} (-1)^{k} \frac{1}{k+n_0} \left[ A(0) - e^{-\lambda t (k+n_0)} A(k+n_0) \right],$$

where A(0) and $A(k+n_0)$ are obtained by using the previously derived equations for
A(j) for j = 0 and for $j = k+n_0$.
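As an illustration of how the final expression might be evaluated, the sketch below codes the sum directly. It assumes x = λt and k = m+s ≥ 1, repeats the closed-form A(j) so that the block is self-contained, and uses the hypothetical name `r_sys` for the resulting reliability. With n_0 = 1 it should reproduce the result of Eq. 8-11.

```python
import math

def A(j, k, x):
    """Closed-form A(j) for k = m + s >= 1 standby items, with x = lambda * t."""
    if j == 0:
        return 1.0 - math.exp(-x) * sum(x**i / math.factorial(i) for i in range(k))
    if j == 1:
        return x**k / math.factorial(k)
    a = j - 1
    poly = sum((-1)**i * (a * x)**(k - 1 - i) / math.factorial(k - 1 - i)
               for i in range(k))
    return math.exp(a * x) * poly / a**k + (-1)**k / a**k

def r_sys(n, n0, k, lam, t):
    """Reliability when n0 of the n active items must work, with k standby items."""
    x = lam * t
    c = math.factorial(n) / (math.factorial(n0 - 1) * math.factorial(n - n0))
    q = c * sum(math.comb(n - n0, i) * (-1)**i / (i + n0)
                * (A(0, k, x) - math.exp(-x * (i + n0)) * A(i + n0, k, x))
                for i in range(n - n0 + 1))
    return 1.0 - q

print(r_sys(3, 2, 1, 1.0, 0.5))
```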
9. Environment and Bound-Crossing Problems

In this section various approaches which have been covered thus far in Parts 
II and III are brought together. Mainly the material is concerned with a multi-item 
system which is to operate in an environment which is known probabilistically or 
which is comprised of functionally related items. In particular, practical conclu- 
sions which can result from the reliability prediction analyses are noted at the ends 
of Secs. 9.2 and 9.3. This is the final section of this report concerned with 
analytical detail which has immediate practical significance. The following sections 
of Part IV mainly present the results of investigations on approaches for bringing 
into the analysis more detailed information bearing on the dependence question. 


9.1 Environment Described Probabilistically 

System reliability logic models such as those developed in Sec. 7 when all
items are independent can be expressed in functional notation as

$$R = R(\mathbf{R}), \quad \mathbf{R} = (R_1, R_2, \ldots, R_n),$$

where R is the reliability for a system and each $R_j$, j = 1, ..., n is the reliability
of a single item. If each is conditional on environment, $R_j \mid \mathbf{s}$, and if the probability
density of the environments $p(\mathbf{s})$ is known, then the unconditional system reliability
is the expected value,

$$E(R) = \int_{\mathbf{s}} R(\mathbf{R})\, p(\mathbf{s})\, d\mathbf{s}. \qquad (9\text{-}1)$$

This is the extension to multiple item models of the approach noted by Eq. 4-15 for 
single items. 

Eq. 9-1 would be applied to the situation in which each item is used at the 
same environment and the environment is described by a probability density p(s) . 

Note that this means that the average reliability of each item cannot be obtained
separately (using Eq. 4-15) and this average reliability substituted into the system
reliability equation $R = R(\mathbf{R})$. That is

$$\int_{\mathbf{s}} (R_1 \mid \mathbf{s})(R_2 \mid \mathbf{s})\, p(\mathbf{s})\, d\mathbf{s} \neq \int_{\mathbf{s}} (R_1 \mid \mathbf{s})\, p(\mathbf{s})\, d\mathbf{s} \int_{\mathbf{s}} (R_2 \mid \mathbf{s})\, p(\mathbf{s})\, d\mathbf{s}$$

for the simple case of two serial items. Whether using the incorrect separate
approach yields conservative or optimistic results depends on the details of the
particular problem. Some generalized statements of this sort have been developed
for certain multi-item stress-strength problems and they will be noted in the following
section.

The system reliability model R = R(R) may result from any of the configurations 
and approaches noted in Sec. 7. The item reliability conditional on environment 
could result from testing. An item reliability measure could be of the form of the 
various measures of Sec. 4, or it could be based on the bound-crossing concepts
of Sec. 5. For bound-crossing concepts where the bound is fixed, such as in Sec. 5.2,
the performance attribute y and the environment $\mathbf{s}$ need to be dependent, i.e.,
$p(y, \mathbf{s}) \neq p(y)\, p(\mathbf{s})$, in order for the item reliability to be conditional on environment.
Application of the fixed bound would give

$$\int^{y_u} p(y \mid \mathbf{s})\, dy = R \mid \mathbf{s}.$$

This resulting bound-crossing based reliability measure can then be readily inserted
into Eq. 9-1 as an $R_j \mid \mathbf{s}$ along with other reliability measures for a multi-item system.

An expanded reliability definition which is essentially an elaboration on
Eq. 9-1 has been proposed in Ref. 47 where the orientation was for catastrophic and
drift failure modes for an item in a probabilistic environment. Eq. 9-1 is thus the
basis of an approach for a probabilistic environment, whether the orientation
is for separate physical items, each with its own reliability measure,
as has been the viewpoint throughout Part III, or for separate failure modes
where multiple modes are specifically identified, as
in Secs. 4.1 and 4.3, in Ref. 47, and as developed in Part IV.

For stress-strength problems where the bound is a distribution such as in 
Sec. 5.3 the item reliability is always conditional on the environment (stress). 
Extreme value approaches cited in Sec. 8.3.2 for obtaining system reliability
models which explicitly considered time are also applicable for certain multi-item
stress-strength problems.

The following section will expand on the multi-item stress-strength problem 
using detailed illustrations. 


9.2 Stress-Strength Problems 

Multi-item stress-strength problems considered here will demonstrate an 
application of the more general remarks made in Sec. 9.1. The general problem area 
is the extension of the single item stress-strength reliability measure of Sec. 5.3 
to a multi-item system. The potential mistake in reliability prediction here is to
obtain separately the reliability of each item in a system, such as by using Eq. 4-15,
and then to substitute this into system logic models such as those from Sec. 7. The
correct development of multi-item stress-strength problems is presented in Section
9.2.1 and this is followed with some important practical conclusions in Section
9.2.2.


9.2.1 Prediction Approaches 

Two basic approaches are used: (1) the calculation of the conditional probability
that the strength exceeds a given stress and then integrating this result
over the assumed stress distribution, and (2) the derivation of the probability
density function (pdf) for the smallest (or largest) value of strength in a sample
of n items, and then using the joint distribution of this density with that of stress
to obtain the desired probability. Mathematically the first computation can be expressed
as follows:

(1) Obtain the probability that $y > s_0$ for a single item, i.e.

$$P(y > s_0) = \int_{s_0}^{\infty} p(y)\, dy,$$

where $s_0$ is the fixed stress level, and p(y) is the pdf of strength.
The examples will use uniform distributions and systems with few items,
but the approaches are of course applicable to different distributions
and systems with many items as well as with complex configurations.

(2) Obtain a general expression for the system reliability R in terms
of the item reliabilities, knowing the system configuration. For each
item, substitute the result of (1) into the system reliability model
to obtain a system model as a function of the stress R(s).

(3) Integrate the above reliability model over the stress pdf, p(s),
i.e.

$$\int_{s=0}^{\infty} R(s)\, p(s)\, ds,$$

where R(s) is the system reliability as given by (2) above.

The second computation follows the procedure described below: 

(1) Obtain the distribution of the smallest strength in the case of a
series logic (or largest strength in the case of parallel logic in which
only one item must operate). For example, the probability that the
smallest item in n selected at random from a distribution function F(y) has
a strength less than y is given by

$$P(y_s \leq y) = 1 - [1 - F(y)]^{n},$$

that is, 1 minus the probability that they (the n strengths) are all
larger than y.

(2) The result in (1) is the distribution function for the smallest
observation and it must be differentiated to obtain the pdf for the
smallest strength $y_s$, $p_1(y_s)$.

(3) The joint pdf of strength $y_s$ and stress s is given by

$$p_1(y_s)\, p(s)$$

and it must be integrated over the region $y_s \geq s$ to obtain the
probability that the strength is adequate to withstand the
imposed stress, i.e.

$$\iint_{y_s > s} p_1(y_s)\, p(s)\, dy_s\, ds.$$

The examples given below will illustrate these two approaches. 


Example 9-1 

First consider a single element with strength between 80 and 
100 psi and stress between 60 and 85 psi. If the density functions 
are uniform on the respective intervals and the stress and strengths 
are independent the following two-dimensional plot indicates the 
region of inadequate strength. 



Figure 9.1 Region of Inadequate Strength 
Since the two-dimensional distribution is uniform the probability that
y exceeds s is given by 0.975, i.e.,

$$\iint_{y>s} \frac{1}{25}\, ds\, \frac{1}{20}\, dy = 1 - \frac{1}{500} \int_{80}^{85} \int_{y}^{85} ds\, dy = 1 - \frac{1}{500} \int_{80}^{85} [85 - y]\, dy$$

$$= 1 - \frac{1}{500} \left[ 85y - \frac{y^2}{2} \right]_{80}^{85} = 1 - 0.025 = 0.975.$$

Thus the item reliability is 0.975.
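Because both densities are uniform, the result of Example 9-1 reduces to a ratio of areas, which can be checked with exact rational arithmetic. The short sketch below is illustrative; the variable names are not from the report.

```python
from fractions import Fraction

# Uniform strength y on [80, 100] psi and uniform stress s on [60, 85] psi.
y_lo, y_hi = 80, 100
s_lo, s_hi = 60, 85

# Failure (y < s) can only occur where the ranges overlap, 80 <= y < s <= 85.
# That region is a right triangle with legs of length s_hi - y_lo.
overlap = s_hi - y_lo
failure_area = Fraction(overlap * overlap, 2)
total_area = (y_hi - y_lo) * (s_hi - s_lo)

reliability = 1 - failure_area / total_area
print(reliability, float(reliability))  # 39/40 0.975
```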

Example 9-2 

Now consider a serial system with n items and suppose that 
each item has the same strength distribution, the items are selected 
at random, and that they are all exposed to the same stress given by 
the stress density function above. Thus the probability that this 
system will be adequate is equivalent to the probability that all 
items are adequate; that is, each of the strengths will exceed the 
stress value. 

Approach 1: Now suppose that the stress is considered to be known or fixed at
$s_0$; then the probability that an item selected at random has strength exceeding
$s_0$ is given by

$$P(y > s_0) = \begin{cases} 1 & \text{if } s_0 < 80 \\ \dfrac{100 - s_0}{20} & \text{if } 80 \leq s_0 \leq 85. \end{cases}$$

Hence the probability that all n items have strength exceeding $s_0$ is given by

$$P^{n}(y > s_0) = \begin{cases} 1 & \text{if } s_0 < 80 \\ \left( \dfrac{100 - s_0}{20} \right)^{n} & \text{if } 80 \leq s_0 \leq 85. \end{cases}$$

The expected value of $P^{n}(y > s_0)$ for the uniform stress distribution is the
unconditional probability of no failure

$$E[P^{n}(y > s_0)] = \int_{60}^{80} 1 \cdot \frac{1}{25}\, ds + \int_{80}^{85} \left( \frac{100 - s}{20} \right)^{n} \frac{1}{25}\, ds = \frac{4}{5} + \left[ -\frac{1}{n+1} \cdot \frac{(100 - s)^{n+1}}{20^{n} \times 25} \right]_{80}^{85}.$$

For n = 2 this probability is 0.95416.
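Evaluating the bracketed term at its limits gives a closed form in n, namely 4/5 + [4/(5(n+1))][1 - (3/4)^(n+1)]. The sketch below evaluates it exactly; the function name `series_reliability` is an assumption made for the illustration.

```python
from fractions import Fraction

def series_reliability(n):
    """E[P^n(y > s)] for n series items sharing one stress (Example 9-2 data)."""
    below = Fraction(80 - 60, 25)   # stress below every possible strength
    # Integral of ((100 - s)/20)^n / 25 over 80 <= s <= 85
    overlap = Fraction(20, 25 * (n + 1)) * (1 - Fraction(3, 4) ** (n + 1))
    return below + overlap

for n in (1, 2, 3):
    print(n, series_reliability(n), float(series_reliability(n)))
```

For n = 1 the expression returns the single-item value 39/40 of Example 9-1, and for n = 2 it returns 229/240, which is 0.95416 when truncated to five places.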
Approach 2: Consider the second approach as described above for the same
problem. The probability that the strength of a serial system is adequate is
equivalent to stating that the minimum strength of n strengths selected at random
from the strength distribution will exceed the stress value. The probability that
the smallest value of $y_i$, i = 1, ..., n (say $y_{(1)}$) is less than y is given by

$$P(y_{(1)} < y) = 1 - (\text{probability all values are greater than } y) = 1 - (1 - F(y))^{n},$$

where F(y) is the distribution function for y as shown below.

Figure 9-2 Strength Distribution Function

Hence

$$F(y) = \begin{cases} 0, & y < 80 \text{ psi} \\ \frac{1}{20}(y - 80), & 80 \leq y \leq 100 \text{ psi} \\ 1, & y > 100 \text{ psi.} \end{cases}$$

Now the probability density function for the smallest observation is

$$p_1(y_s) = n[1 - F(y_s)]^{n-1} p(y_s).$$

Thus the joint density function for $y_s$ and s is

$$p_1(y_s)\, p(s)\, dy_s\, ds = n[1 - F(y_s)]^{n-1} p(y_s) \cdot p(s)\, dy_s\, ds.$$

Using the fact that y and s have uniform probability density functions,

$$p(y) = \frac{1}{20}, \quad p(s) = \frac{1}{25}.$$

Hence the probability that $y_s < s$ (that is, a failure occurs) is given by

$$\iint_{y_s < s} n \left[ 1 - \frac{1}{20}(y_s - 80) \right]^{n-1} \frac{1}{20} \cdot \frac{1}{25}\, ds\, dy_s = \int_{80}^{85} n \left[ 1 - \frac{1}{20}(y_s - 80) \right]^{n-1} \frac{1}{20} \cdot \frac{1}{25}\, [85 - y_s]\, dy_s.$$

For n = 2 this reduces to 0.04583, and thus the probability that $y_s > s$ is
1 - 0.04583 = 0.95417, which agrees with the Approach 1 result to within rounding in the last digit.
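The single integral over the smallest strength can be checked numerically. The following is a rough sketch using the midpoint rule; the step count and the function name `failure_probability` are arbitrary choices for the illustration, and the uniform distributions are those of Example 9-2.

```python
def failure_probability(n, steps=100_000):
    """Midpoint-rule estimate of P(y_s < s) for n series items."""
    h = 5.0 / steps                      # the overlap region is 80 <= y_s <= 85
    total = 0.0
    for i in range(steps):
        y = 80.0 + (i + 0.5) * h
        # pdf of the smallest of n strengths, each uniform on [80, 100]
        density = n * (1.0 - (y - 80.0) / 20.0) ** (n - 1) / 20.0
        # probability that the uniform [60, 85] stress exceeds y
        total += density * (85.0 - y) / 25.0 * h
    return total

q = failure_probability(2)
print(q, 1.0 - q)
```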

Example 9-3 

Suppose there are three items in parallel and that at least 
one must work (strength exceed stress). Let the strength distribu- 
tions be identical and uniform as given above and let the stress 
distribution be the same as above. 

Approach 1: Using the first approach we obtain the probability that the
strength of a single item exceeds a specified stress $s_0$ and then integrate this
result over the stress distribution to obtain the unconditional probability. The
probability that for a single element, strength exceeds stress $s_0$ is given by

$$P(y > s_0) = \begin{cases} 1 & \text{if } s_0 < 80 \\ \dfrac{100 - s_0}{20} & \text{if } 80 \leq s_0 \leq 85. \end{cases}$$

The probability that at least one of three exceeds the value $s_0$ is

$$1 - P[\text{all three have strength less than } s_0] = 1 - [1 - P(y > s_0)]^{3},$$

and hence the unconditional probability that the system is adequate is

$$\int_{60}^{85} \left\{ 1 - [1 - P(y > s)]^{3} \right\} \frac{1}{25}\, ds = \int_{60}^{80} 1 \cdot \frac{1}{25}\, ds + \int_{80}^{85} \left\{ 1 - \left[ 1 - \frac{100 - s}{20} \right]^{3} \right\} \frac{1}{25}\, ds$$

$$= \frac{4}{5} + \int_{80}^{85} \left\{ 1 - \left[ 1 - \frac{100 - s}{20} \right]^{3} \right\} \frac{1}{25}\, ds = 0.999218.$$


Approach 2: Using this approach the density function must first be obtained
for the largest strength. The probability that the largest of three strengths exceeds
y is given by

$$1 - P(\text{all three strengths are less than } y)$$

and the probability that the largest is less than or equal to y is $F^{3}(y)$, where

$$F(y) = \begin{cases} 0, & y < 80 \\ \frac{1}{20}(y - 80), & 80 \leq y \leq 100 \\ 1, & y > 100. \end{cases}$$

Thus by differentiating $F^{3}(y)$ the pdf of the largest strength is obtained, i.e.

$$p_1(y) = \begin{cases} 0, & y < 80 \\ 3 \left[ \frac{1}{20}(y - 80) \right]^{2} \frac{1}{20}, & 80 < y < 100 \\ 0, & y > 100, \end{cases}$$

and thus the probability that y exceeds s is given by

$$1 - \int_{80}^{85} 3 \left[ \frac{1}{20}(y - 80) \right]^{2} \frac{1}{20} \left( \int_{y}^{85} \frac{1}{25}\, ds \right) dy = 0.999218.$$
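The parallel case also has a simple closed form: all n items fail only when the stress exceeds every strength, so the failure probability is the integral of ((s - 80)/20)^n / 25 over 80 ≤ s ≤ 85. A minimal exact evaluation follows, with `parallel_reliability` as an illustrative name.

```python
from fractions import Fraction

def parallel_reliability(n):
    """P(at least one of n parallel items is stronger than the common stress)."""
    # Integral of ((s - 80)/20)^n / 25 over 80 <= s <= 85: all n items too weak.
    all_fail = Fraction(20, 25 * (n + 1)) * Fraction(1, 4) ** (n + 1)
    return 1 - all_fail

r3 = parallel_reliability(3)
print(r3, float(r3))  # 1279/1280 0.99921875
```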


9.2.2 Practical Results 

The results of the examples in Sec. 9.2.1 will be used to illustrate the
error introduced by incorrectly treating probabilistic dependence. Recall that the
single item reliability from Ex. 9-1 was 0.975. In Ex. 9-2 for two series items, the
correct approach resulted in R = 0.95416. If the incorrect approach
of treating these as two independent items in series had been used, then $R = 0.975^{2} = 0.950625$.
Thus the incorrect approach resulted in an unwarranted pessimistic reliability for a
series system. Further, in Ex. 9-3 for three parallel items the correct approach
resulted in R = 0.999218. If the incorrect approach of treating these as three
independent items had been used, then $R = 1 - 0.025^{3} = 0.999984$. Thus for a parallel
system the incorrect approach resulted in an unwarranted optimistic reliability.
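The direction of the error can be explored for other numbers of items with a small numerical version of the first approach. The sketch below is illustrative only: it averages an arbitrary system logic function over the uniform stress of the examples by the midpoint rule and compares the result with the separate-item shortcut. The names `item_reliability` and `expected` are assumptions.

```python
def item_reliability(s):
    """P(strength > s) for strength uniform on [80, 100] psi."""
    return min(1.0, max(0.0, (100.0 - s) / 20.0))

def expected(system_logic, steps=50_000):
    """Midpoint-rule average of system_logic(R(s)) over stress uniform on [60, 85] psi."""
    h = 25.0 / steps
    total = sum(system_logic(item_reliability(60.0 + (i + 0.5) * h))
                for i in range(steps))
    return total * h / 25.0

single = expected(lambda r: r)               # item reliability, about 0.975
for n in (2, 3, 5, 10):
    series_correct = expected(lambda r: r ** n)
    series_separate = single ** n            # incorrect: items treated separately
    parallel_correct = expected(lambda r: 1.0 - (1.0 - r) ** n)
    parallel_separate = 1.0 - (1.0 - single) ** n
    print(n, series_correct, series_separate, parallel_correct, parallel_separate)
```

Under these assumptions the series shortfall of the separate-item calculation grows with n, which is consistent with the remark that the small errors above reflect the small number of items.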

Although the magnitude of these errors for these examples is not very large, it
should be recognized that only a few items were considered. The errors in the above
examples illustrate the results of more extensive analyses in Refs. 48 and 49. These 
references show these results with greater elaboration for certain situations where 
each item is identical and at the same stress: 

(1) Serial System. Obtaining the reliability of each item separately 
and then substituting these into a series system model of multiplying 
item reliabilities will yield pessimistic system reliability predictions. 

(2) Parallel System. Obtaining the reliability of each item separately 
and then substituting this into a logic reliability model will yield 
optimistic system reliability predictions. 

These results have been shown for situations where the stress-strength distributions
are normal [Ref. 48] and where they are rectangular [Ref. 49]. Some practical
guidelines gleaned from these results and expanded on in these references are:

Serial Systems 

(1) Mount items so they experience the same environment, i.e., a compact 
unit. 

(2) Use consecutively manufactured items in the same system, i.e., same
manufacturer and lot.

(3) Select items with similar failure modes. 

Parallel Systems 

(1) Mount items so they experience different environments, i.e., different 
planes and location. 

(2) Use items in the same system from different manufacturers and lots. 

(3) Select items with different failure modes. 

9.3 Functionally Related Variables 

One class of multi-item bound-crossing reliability prediction problems is
that in which there is no meaningful reliability measure for each item in the system.

In the multi-item stress-strength problem of Sec. 9.2 (where in the more general 
terminology the item strengths were item performance characteristics and the stress 
was the interface characteristic) it was appropriate to have a reliability measure 
for each item and a multi-item or system reliability measure. The problem being 
considered here is a more general one where there is not a performance attribute
for each item, but the performance attributes are only for the system. That is, the 
functional relationships between the system performance attributes and the item 
and interface characteristics are such that any possible variation in any characteris- 
tic can always be compensated for by some possible variation in a different 
characteristic. The problem here is to obtain the distribution of the performance 
attributes from the distributions of the item and interface characteristics. Then 
the bounds are applied to the system performance attributes for the system relia- 
bility prediction. We will not be concerned here in Sec. 9.3 with mixtures of 
this more general problem with those of Secs. 9.1 and 9.2. The reader interested 
in such complexities is referred to Part IV. Some practical problems which have 
been widely treated in reliability analysis are those for performance variation 
analysis of electronic circuits and of systems in general [Ref. 7]. 

The basic procedure for reliability prediction of functionally related 
variables is as follows: 

(1) Select the performance attributes of interest. These most often 
are functional outputs . 

(2) Develop the deterministic mathematical models at nominal conditions 
relating the performance attributes to item and interface characteristics. 

(3) Estimate the variability of the item and interface characteristics. 

For electronic parts these typically reflect the initial (manufacturing) 
variations, aging effects, and the influence of environmental inputs. 

(4) Compute the following: 

a. The expected variability of and possibly the correlation 
between the performance attributes. 

b. Identify sources of performance attributes variability. 

Possible sources include contributions from the linear, non- 
linear, and interaction behavior of the deterministic models, 
and from variations and correlation between the independent 
variables . 

c. Predict the probability of successful performance by 
assigning limits to the expected performance attribute 
variations . 

The more practical benefits are using the results of (4) for identifying designs 
which are susceptible to failure, and for providing redesign guidance. They are 
also useful for comparing alternate design approaches, and for aiding the assignment 
of specification limits. Normally the prediction of the probability of acceptable 
performance that can be obtained from a performance variations analysis is not highly 
precise because the approach is an approximate one, but more so because of the lack
of precision in the data on part and interface characteristic behavior. 

A solution to the problem in closed form is almost never possible; where one
can be formulated, it mainly provides a better understanding of what an approximate
approach is attempting. (See the discussion in Section 11 concerning Mode 2 for
identification of an approach in closed form.) What is usually done in practice is
to use an approximate approach such as the method of moments (sometimes called the
propagation of errors). Other approaches are identified in Ref. 7. The method of
moments approach is presented below as an illustration.

9.3.1 Method of Moments 

In the moments approach the functional relationship is expanded in a Taylor 
series. Higher order terms may be used, although most applications only use the 
linear terms. Measures of location and variability of the item and interface 
characteristics, which are the independent variables, are described by means and 
central moments. The degree of association which might exist between two independent 
variables is described by the correlation coefficient. The mean and central moments 
of the dependent variables are obtained from the application of expected value 
theory, which gives the mean and central moments of the dependent variable as 
functions of terms obtained from the Taylor series expansion and the mean and central 
moments of the independent variables. The distribution of the performance variables 
is then obtained by either assuming a distribution, or by fitting a distribution by 
the method of equating moments, for example. Correlation between the various 
performance attributes can also be obtained by this approach, but this is not 
usually noted or developed in reliability applications of this technique.

For simpler problems, requiring the use of only first order terms, it is 
possible to use this technique without a computer. Conversion of the functional 
model to a Taylor series yields sensitivity and possible interaction terms which 
readily provide information on variability sources. When the problem becomes more
complex, as with an involved functional relationship and high order moments, a computer
is required. Advantages of this approach are simplicity for easier problems, and
resultant information on sources of variability. 

Mathematically the method of moments for a single performance attribute is 
as follows: 

If the relationship

$$y = y(x_1, x_2, \ldots, x_n)$$

can be approximated by a linear function

$$y \approx c_0 + c_1 x_1 + \ldots + c_n x_n,$$

it is possible to approximate the distribution of y for certain distributions of
the variables $x_i$, i = 1, ..., n. For example, if $x_i$ is normally distributed with
mean $\mu_i$ and standard deviation $\sigma_i$ and if the correlation between $x_i$ and $x_j$ is $r_{ij}$,
then the distribution of y is approximately normal with mean

$$\mu\{y\} = c_0 + c_1 \mu_1 + \ldots + c_n \mu_n$$

and standard deviation

$$\sigma\{y\} = \left[ c_1^{2} \sigma_1^{2} + \ldots + c_n^{2} \sigma_n^{2} + 2 c_1 c_2 \sigma_1 \sigma_2 r_{12} + \ldots + 2 c_{n-1} c_n \sigma_{n-1} \sigma_n r_{n-1,n} \right]^{1/2}.$$
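The two expressions above translate directly into a few lines of code. This is a minimal sketch for the first-order (linear) case only; the function name `linear_moments` and its argument layout are assumptions made for the illustration.

```python
import math

def linear_moments(c0, c, mu, sigma, r):
    """Mean and standard deviation of y = c0 + sum(c[i] * x[i]).

    r[i][j] is the correlation between x[i] and x[j]; only entries with i < j are read.
    """
    n = len(c)
    mean = c0 + sum(c[i] * mu[i] for i in range(n))
    var = sum((c[i] * sigma[i]) ** 2 for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            var += 2.0 * c[i] * c[j] * sigma[i] * sigma[j] * r[i][j]
    return mean, math.sqrt(var)

# Two uncorrelated inputs with standard deviations 3 and 4 give sigma{y} = 5.
print(linear_moments(1.0, [1.0, 1.0], [2.0, 3.0], [3.0, 4.0], [[1.0, 0.0], [0.0, 1.0]]))
```

With perfectly correlated inputs (r = 1) the same call returns a standard deviation of 7, which shows how strongly the correlation terms can affect the predicted spread.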


Example 9-4 

Model 

The linear amplifier, for which the circuit is shown in 
Fig. 9-3 is used here to illustrate a reliability prediction 
analysis using the method of moments . 


For audio frequency applications, the transistor is adequately
described by the hybrid or h-parameters. See Ref. 30 for further
details on the circuit description and the derivation of the mathematical
model. From circuit analysis the model for current gain
is as follows:

$$A_i = \frac{R3}{R3 + R4} \cdot \frac{h_{fe}}{1 + h_{oe} U_2} \cdot \frac{U_1}{\dfrac{\Delta^{e} U_2 + h_{ie}}{1 + h_{oe} U_2} + U_1},$$

where

$$U_1 = \frac{R1\, R2}{R1 + R2}, \quad U_2 = \frac{R3\, R4}{R3 + R4}, \quad \Delta^{e} = h_{ie} h_{oe} - h_{re} h_{fe}.$$

Part Characteristics 

The means and standard deviations of the part characteristics 
are contained in Table 9-1. 
Table 9-1

Linear Amplifier Circuit Component
Part Parameters - Means and Standard Deviations

Parameter    Mean                   Standard Deviation
R1           47.05K ohm             0.97K ohm
R2           7.03K ohm              0.17K ohm
R3           380.9 ohm              8.54 ohm
R4           468.7 ohm              11.14 ohm
h_fe         102                    11.1
h_re         576 x 10^-6            0.46 x 10^-6
h_oe         556 x 10^-6 mhos       68.6 x 10^-6 mhos
h_ie         254                    24.9

Figure 9-3 Linear Amplifier Circuit
The following matrix contains the correlation coefficients $r_{ij}$ between
pairs of the equivalent circuit transistor parameters. The resistances
are sampled at random from separate distributions and are uncorrelated
with each other and with the h-parameters.

        h_fe    h_oe    h_ie    h_re
h_fe    1       0.595   0.912   0.165
h_oe            1       0.608   0.400
h_ie    (by symmetry)   1       [illegible]
h_re                            1


Analysis 

As suggested in the proposed approach one first performs a
sensitivity analysis and checks the function $A_i = A_i(\cdot)$ for
non-linearity and for interaction. Because the function is essentially
linear, the first and second moments of the performance can be obtained
from the linear approximation to the performance, i.e.

$$A_i \approx c_0 + c_1 \Delta h_{fe} + c_2 \Delta h_{re} + \ldots + c_8 \Delta R4$$

$$= 39.38 + 0.387\, \Delta h_{fe} + 118.3\, \Delta h_{re} - 0.742 \times 10^{4}\, \Delta h_{oe} - 0.00619\, \Delta h_{ie} + 0.416 \times 10^{-5}\, \Delta R1 + 0.186 \times 10^{-3}\, \Delta R2 + 0.0512\, \Delta R3 - 0.0502\, \Delta R4,$$

where $\Delta$ denotes the deviation of a characteristic from its mean.

Output 

The estimated mean and standard deviation of $A_i$ are given by

$$E\{A_i\} = 39.38$$

and

$$\sigma\{A_i\} = \left[ (0.387)^{2} s^{2}\{h_{fe}\} + \ldots + (-0.0502)^{2} s^{2}\{R4\} + 2(0.387)(118.3)\, s\{h_{fe}\}\, s\{h_{re}\}\, r\{h_{fe}, h_{re}\} + \ldots + 2(-0.742 \times 10^{4})(-0.00619)\, s\{h_{oe}\}\, s\{h_{ie}\}\, r\{h_{oe}, h_{ie}\} \right]^{1/2}$$

$$= 3.91.$$
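The standard deviation can be recomputed from the tabulated coefficients, standard deviations and correlations. The sketch below is illustrative only. The correlation between $h_{ie}$ and $h_{re}$ is not legible in the source matrix and is set to zero here as an assumption; its term is several orders of magnitude smaller than the others, so the choice has no visible effect on the result.

```python
import math

names = ["h_fe", "h_re", "h_oe", "h_ie", "R1", "R2", "R3", "R4"]
coef = [0.387, 118.3, -0.742e4, -0.00619, 0.416e-5, 0.186e-3, 0.0512, -0.0502]
sdev = [11.1, 0.46e-6, 68.6e-6, 24.9, 0.97e3, 0.17e3, 8.54, 11.14]

corr = {
    ("h_fe", "h_oe"): 0.595, ("h_fe", "h_ie"): 0.912, ("h_fe", "h_re"): 0.165,
    ("h_oe", "h_ie"): 0.608, ("h_oe", "h_re"): 0.400,
    ("h_ie", "h_re"): 0.0,   # assumption: entry not legible in the source
}

def rho(a, b):
    """Correlation between two parameters; resistors are uncorrelated with everything."""
    return corr.get((a, b), corr.get((b, a), 0.0))

var = 0.0
for i in range(len(names)):
    var += (coef[i] * sdev[i]) ** 2
    for j in range(i + 1, len(names)):
        var += 2.0 * coef[i] * coef[j] * sdev[i] * sdev[j] * rho(names[i], names[j])

sigma = math.sqrt(var)
print(sigma)
```

With the rounded inputs as tabulated, this gives a value of about 3.94, close to the 3.91 quoted; the small difference may reflect rounding in the published coefficients.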
Remark 1. A reliability prediction is obtained by assigning a desired limit and then
obtaining a numerical value from any normal distribution table.

Remark 2. If the function could not be approximated by a linear function, higher order
moments and/or distributions of the part characteristics would be required.

Remark 3. The standard deviations and means used in the above analysis were
inherent variations in the part characteristics. Variation as a result of operation 
environment, inputs, stresses, loads, and/or aging were not included. The analysis 
would be the same except that the total standard deviations would be larger than the 
above. In addition, correlations between the behavior of the parts characteristics 
may be introduced as a result of changes in a third variable, such as temperature, 
affecting two or more part characteristics. 
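Remark 1 can be sketched in code by replacing the table lookup with the error function. The limits of 30 and 50 on the current gain are hypothetical values chosen only for illustration; the mean and standard deviation are those of the Output paragraph, and normality of $A_i$ is assumed.

```python
import math

def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_within(lower, upper, mean, sd):
    """P(lower <= A_i <= upper) when A_i is normal with the given mean and sd."""
    return normal_cdf((upper - mean) / sd) - normal_cdf((lower - mean) / sd)

mean, sd = 39.38, 3.91
lower, upper = 30.0, 50.0   # hypothetical specification limits
print(prob_within(lower, upper, mean, sd))
```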
Part IV. Refinements of Prediction Models


Some material concerning the structuring of reliability prediction models is 
presented in this part. The intent here is to provide insight somewhat beyond the 
current conventional practices. On occasion a hue and cry is raised as to whether 
or not current conventional practices are appropriate. Anyone who has performed 
reliability predictions and who has given serious consideration to the appropriate- 
ness of these predictions has likewise on occasion felt a bit uncomfortable. Yet, to 
many persons it is not obvious how to go beyond conventional practices. 

The material presented in Part IV is directed toward those who are concerned 
with the development of reliability prediction models. A frame of reference is 
presented which will fit together details of certain reliability prediction problems. 
There are strong limitations on the extent to which these notions can be applied to 
real problems, with the main limitation being data. 

Developing an approach to structuring reliability prediction models that reflects
more detail is itself a stumbling point. The difficulties may
eventually turn out to be elementary in hindsight, but documentation providing
guidance on the type of problem considered in the following sections is rare.

Remarks will be made freely in the hope that some may be of help in overcoming these 
difficulties. The following questions introduce some possible stumbling points and 
questions of interest. 

(1) What is to be done if the conventional assumption of probabilistic 
independence is not made? What are sources of dependence and how are 
they reflected in structuring the problem? 

(2) What are the features of a failure mode? How are variables treated 
which are probabilistic but which do not have values that always 
cause a failure? 

(3) What is the pertinence of the typical engineering deterministic 
equations used for obtaining performance and stress? 

(4) What is the relationship between degradation or catastrophic 
failure at the source (point of repair) and the manner in which system 
performance will be affected? 

(5) How are the above considerations brought together? 

(6) What are the implications of replies to these questions on real-world 
reliability predictions and on other reliability analyses? 

Two examples are considered in some detail in Secs. 10 and 11 to introduce the 
notation and to structure the reliability prediction problem. Sec. 10 discusses 
an example intended to illustrate features of catastrophic and degradation 
failures. A number of related problems are treated simultaneously in a different 
example presented in Sec. 11. The purpose is to structure the problems, not to 
obtain numerical solutions. Next, in Sec. 12, the points made in Secs. 10 and 11 
are expressed in general notation, which results in detailed reliability 
prediction models. The above questions are answered individually in Sec. 13 on 
the basis of Secs. 10, 11, and 12; those answers serve as the concluding remarks 
for Part IV. 

This material is somewhat related to earlier efforts at RTI supported by 
NASA ORQA [Ref. 47]. 

10. Catastrophic and Degradation Failures 

There are two broad classes of failure modes which are popularly cited; these 
are a catastrophic failure and a degradation (or drift) failure. A degradation 
failure is an unsatisfactory level of a performance attribute, and a catastrophic 
failure is an abrupt change in a performance attribute, usually culminating in no 
meaningful measure of the performance attribute. 

The question here is, "Is there a unique relationship between the classification 
of system failure into catastrophic or degradation and a similar classification of 
the source (point of repair) of system failure?" The answer will be developed by 
considering some failures associated with an electronic transmitter. 

Catastrophic failure at the source: 

(1) A part within a system opens or shorts. The result could be an immediate 
catastrophic failure of an output performance attribute, or it could be a 
degradation failure of an output performance attribute. For example, the heater 
winding of a temperature control oven opens and the carrier frequency of a 
transmitter drifts. The open oven winding illustrates a case where the system 
does not fail immediately; instead, the probability of later system failure 
increases. 

(2) An input such as a supply voltage is completely lost. This results in 
the complete loss of all performance attributes. 

Degradation failure (or conditions) at the source: 

(1) An output performance attribute crosses a bound and is considered to 
have failed. Here there is some value of the performance attribute present, but it 
is outside of the desired range. This type of failure may have no single cause, as 
there may be several different parts which could be changed in order to correct the 
failure. There may be several items considered as failures according to the bounds 
on the performance characteristic in each part's specification, or there may be no 
part considered as having failed according to these criteria. Here there would be a 
functional relationship between the output performance attribute and the characteristics 
of the parts. 

(2) An internal performance attribute crosses a bound, which causes an output 
performance attribute to fail catastrophically. An example is an oscillator ceasing 
to oscillate because of part characteristic value changes, with the result that an 
output performance attribute fails catastrophically. This type of failure is similar 
to the above as it may have no single cause. 

The above examples illustrate that there is no unique correspondence between 
catastrophic and degradation failure modes at the detailed level (source) and those 
at the system output performance attribute level. That is, a catastrophic failure 
at the point of repair may show up at the system performance level as either a 
catastrophic or a degradation failure, and the same holds for a degradation failure 
at the point of repair. Further, a degradation failure may not have any unique 
point of repair. The examples cited above for degradation failures within a system 
were cases where there was no unique repair point. A unique repair point is 
possible, however, as where an output performance attribute is only a 
transformation of a single part characteristic. A more specific illustration 
drawing on the above examples is shown as Table 11.1 to assist in summarizing 
these relations. 

Table 11.1 

Illustrating Output-Source and Catastrophic-Degradation 
Failure Mode Relations for a Transmitter 

System output performance    Source of failure within the system 
attribute behavior           Degradation                  Catastrophic 
---------------------------  ---------------------------  ---------------------- 
Degradation, e.g., carrier   Oscillator drifted; may      Open winding of 
frequency drift              not have a unique source.    temperature control 
                                                          oven. 

Catastrophic, e.g., no       Oscillator ceases to         Supply voltage lost. 
output                       oscillate; may not have a 
                             unique source. 


Whether a failure is catastrophic or degradation will not be a dominating 
consideration in the ensuing discussion. That is, it is not absolutely necessary that 
the model keep track of which of the two failure modes applies. This is a key point 
in structuring a detailed reliability prediction model, as there is a tendency to 
carry along too much detail in the notation, which leads to sidetracking. Any 
detailed failure mode at the source may be introduced into a particular problem by 
any of several description methods, which will be covered in the following 
sections. The method depends on the form of the given information. Of course, 
certain forms could prevail, as for certain commodities. The notation which will be 
used does not specifically identify the type of failure as catastrophic or 
degradation. Of course, the person setting up the problem will have a classification 
in mind for each mode which is introduced. 

11. A Detailed Prediction Example 

In order to orient the reader, a somewhat realistic problem will be considered. 
Figure 11-1 illustrates an electrical circuit, a regulated voltage divider using a 
zener diode as a reference. The typical electrical notation is shown first in 
capital letters, and the equivalent functional notation which will be used in 
Part IV is given in parentheses in small letters. This functional notation will 
be defined and discussed in a more general vein in Sec. 12. 

The notation which appears in Fig. 11-1, plus several additional terms, is 
discussed below. 

$Z$                     Zener reference diode, assumed to be a constant voltage 
                        source over the current range of interest, 
$R_1, R_2, R_3$         Resistors, each with a deterministic temperature relationship, 
$E_i = x_1$             Input voltage, 
$E_Z = x_2$             Zener reference voltage, 
${}^{\circ}\mathrm{C} = x_3$   Ambient temperature in °C, 
$x_4$                   Resistance of $R_1$ at 25°C, 
$x_5$                   Resistance of $R_2$ at 25°C, 
$x_6$                   Resistance of $R_3$ at 25°C, 
$E_{01} = y_1$          Output voltage, 
$E_{02} = y_2$          Output voltage, 
$k_i$, $i = 1, 2, 3$    Linear temperature coefficient for the ith resistor, 
$R_1 = y_3$             Resistance of $R_1$ at a specific temperature, 
                        $y_3 = x_4 + (x_3 - 25) k_1$, 
$R_2 = y_4$             Similar to $y_3$, $y_4 = x_5 + (x_3 - 25) k_2$, 
$R_3 = y_5$             Similar to $y_3$, $y_5 = x_6 + (x_3 - 25) k_3$, 
$I_1 = u_1$             Current as designated in Fig. 11-1, 
$I_2 = u_2$             Current as designated in Fig. 11-1, 
$W_Z = u_3$             Power of zener diode, 
${}^{\circ}\mathrm{C} = u_4$   Ambient temperature of the circuit, 
$W_{R_2} = u_5$         Power of $R_2$, and 
$W_{R_3} = u_6$         Power of $R_3$. 

Figure 11-1 Regulated Voltage Supply 

Note: The symbols in parentheses in the figure correspond to those used in the 
problem formulation in this section. 
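
The deterministic relations among these symbols can be collected into one small function. The following is a minimal sketch, assuming the divider and current relations used for the individual modes in this section; the function name `circuit` and every numeric value (nominal voltages, resistances, and temperature coefficients) are illustrative assumptions and do not come from the report. 

```python
# Deterministic relations of the Sec. 11 example, written with the report's symbols.
# All numeric values below are illustrative assumptions.

def circuit(x, k):
    """Map common variables x = (x1, ..., x6) to performance attributes and stresses."""
    x1, x2, x3, x4, x5, x6 = x
    k1, k2, k3 = k
    dt = x3 - 25.0                  # temperature offset from 25 deg C
    y3 = x4 + dt * k1               # R1 at temperature x3
    y4 = x5 + dt * k2               # R2 at temperature x3
    y5 = x6 + dt * k3               # R3 at temperature x3
    u1 = (x1 - x2) / y3             # I1
    u2 = x2 / (y4 + y5)             # I2
    return {
        "y1": y5 * x2 / (y4 + y5),  # E01
        "y2": x2,                   # E02
        "u3": x2 * (u1 - u2),       # W_Z, zener power
        "u4": x3,                   # ambient temperature
        "u5": u2 ** 2 * y4,         # W_R2
        "u6": u2 ** 2 * y5,         # W_R3
    }


nominal = (28.0, 10.0, 25.0, 1000.0, 5000.0, 5000.0)  # assumed x1..x6
out = circuit(nominal, k=(0.1, 0.5, 0.5))              # assumed k1..k3
print(out)
```

Keeping every performance attribute and stress as a function of the same vector of common variables is what later allows the modes to be combined without assuming independence. 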

Some of the variables are continuous and have known or assumed probability density 
functions (pdf's). These are: 

Known pdf            Comment 

$p(x_1 \mid m_1)$    $x_1$, or $E_i$, has a probability density which is conditional 
                     on no complete loss of $E_i$, which is designated $m_1$. 

$p(x_2, x_3)$        $x_2$, or $E_Z$, has a probabilistic dependence with $x_3$, or °C. 
                     That is, temperature effect on the reference diode is not 
                     known deterministically. 

$p(x_3)$             The probability density of temperature, which is the 
                     marginal density of $p(x_2, x_3)$. 

$p(x_4)$             Nominal value of resistor $R_1$ is independent. 

$p(x_5, x_6)$        Resistors $R_2$ and $R_3$ are of the same nominal value, and 
                     have a probabilistic dependence; when one is high, the 
                     other also tends to be high. 

The circuit is designated as having seven different failure modes: 

j = 1     Complete loss of $E_i$, 
j = 2     Drift of $E_{01}$ (or $y_1$) outside of an acceptable interval 
          of values $\Gamma_{y_1}$, 
j = 3     Drift of $E_{02}$ (or $y_2$) outside of an acceptable interval 
          of values $\Gamma_{y_2}$, 
j = 4     Catastrophic failure of $Z$, 
j = 5     Catastrophic failure of $R_1$, 
j = 6     Catastrophic failure of $R_2$, 
j = 7     Catastrophic failure of $R_3$, and 
$m_j$     Event that the jth failure mode does not occur. 



Catastrophic failures noted above are those which might occur as influenced by 
the internal stresses. It is known that each item is not initially catastrophically 
failed. Additional known information concerns each catastrophic failure mode, $m_4$ 
through $m_7$. Relationships between the probability that these failure modes will 
not occur and appropriate environments are known; thus 
$P(m_j \mid \ldots\text{environment(s)}\ldots) = m_j(\ldots\text{environment(s)}\ldots)$ 
are available for $m_4$ through $m_7$. Note that conventional graphs for 
failure rate versus stresses such as those found in MIL-HDBK-217A [Ref. 27] could 
provide this type of relationship. 

The functional notation for deterministic relations will be such as 
$y_3 = y_3(x_3, x_4)$, $P(m_4 \mid u_3, u_4) = m_4(u_3, u_4)$, and 
$u_1 = u_1(x_1, x_2, x_3, x_4)$. 

The question is how to structure the problem for the probability that none of 
these failure modes occur, where no assumption of independence is made involving the 
features noted above. Each of the failure modes will be treated separately and then 
they are brought together into a composite model. 

Mode $m_1$ 

Mode reliability $P(m_1)$ is some known value between 0 and 1. 

Mode $m_2$ 

Electrical equations are conventional engineering deterministic ones and are 
used for obtaining the performance attribute $E_{01}$ (or $y_1$), i.e., 

$$E_{01} = y_1 = \frac{y_5 x_2}{y_4 + y_5} = y_1(x_2, x_3, x_5, x_6)$$ 

where $y_4$ and $y_5$ are known functions of $x_3$, $x_5$, and $x_6$. The zener 
reference voltage $x_2$ is dependent on $x_3$, the ambient temperature, and the 
joint pdf $p(x_2, x_3)$ has been obtained. The mode reliability $P(m_2)$ is the 
probability that $y_1$ falls within the interval of acceptable values, 
$\Gamma_{y_1}$. If $p(y_1)$ denotes the pdf of $y_1$, then 

$$P(m_2) = \int_{\Gamma_{y_1}} p(y_1)\, dy_1 .$$ 

However, a difficult problem is implied by the above integration, that of obtaining 
$p(y_1)$ using the functional relationship given above. 

In some few problems a transformation can be defined relating the new variable 
$y_1$ to the original variables $x_2$, $x_3$, $x_5$, and $x_6$, and the distribution 
of the new variable obtained from that of the original variables by means of the 
Jacobian of the transformation. (See Ref. 51 for a description of the method.) 

Usually the above approach is tedious or the integral cannot be obtained in 
closed form, and one has to use some other approach. Often the method of moments 
is used, in which $y_1$ is expanded in a Taylor series using the first order terms 
(higher order terms may be used but seldom are), and the first and second order 
moments of $y_1$ are obtained in terms of the moments of $x_2$, $x_3$, $x_5$, and 
$x_6$. The distribution of $y_1$ is then approximated from these moments. 
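
The first-order method of moments can be sketched numerically as follows. This is a minimal sketch under stated assumptions: the means, standard deviations, correlations, and temperature coefficients are invented for illustration, the gradient is taken by central differences rather than analytically, and the helper name `first_order_moments` is hypothetical. 

```python
def y1(v, k2=0.5, k3=0.5):
    """Output voltage y1 as a function of v = (x2, x3, x5, x6); k2, k3 are assumed."""
    x2, x3, x5, x6 = v
    y4 = x5 + (x3 - 25.0) * k2
    y5 = x6 + (x3 - 25.0) * k3
    return y5 * x2 / (y4 + y5)


def first_order_moments(f, mean, cov, rel_step=1e-6):
    """First-order Taylor approximation of the mean and variance of f(x)."""
    n = len(mean)
    grad = []
    for i in range(n):
        h = rel_step * max(abs(mean[i]), 1.0)
        up = list(mean)
        dn = list(mean)
        up[i] += h
        dn[i] -= h
        grad.append((f(up) - f(dn)) / (2.0 * h))   # central difference
    var = sum(grad[i] * cov[i][j] * grad[j] for i in range(n) for j in range(n))
    return f(mean), var


# Assumed moments of (x2, x3, x5, x6); the correlations mimic p(x2, x3) and p(x5, x6).
mean = [10.0, 25.0, 5000.0, 5000.0]
sd = [0.1, 20.0, 50.0, 50.0]
rho = {(0, 1): 0.3, (2, 3): 0.8}
cov = [[0.0] * 4 for _ in range(4)]
for i in range(4):
    cov[i][i] = sd[i] ** 2
for (i, j), r in rho.items():
    cov[i][j] = cov[j][i] = r * sd[i] * sd[j]

mean_y1, var_y1 = first_order_moments(y1, mean, cov)
print(mean_y1, var_y1 ** 0.5)
```

With these assumed values the positive correlation between the two divider resistors reduces the variance of $y_1$ relative to the uncorrelated case, because the divider ratio is insensitive to a common shift in both resistances. 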

Another procedure is to evaluate the integral 

$$P(m_2) = \int_{\Gamma_{y_1}} p(x_2, x_3)\, p(x_5, x_6)\, dx_2\, dx_3\, dx_5\, dx_6$$ 

where $\Gamma_{y_1}$ determines a region of integration of $x_2$, $x_3$, $x_5$, and 
$x_6$. This is still difficult, but some approximations may be possible, and a 
Monte Carlo simulation could be used to obtain the estimate. However, the latter 
approach would require a very large number of trials if $P(m_2)$ is near 1 and a 
high precision of the estimate is desired. 

Although each of the above procedures is suitable for estimating the reliability 
of this mode, the latter one will be required for bringing together all modes for a 
single circuit reliability model. This will be treated later. 
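
A Monte Carlo estimate of $P(m_2)$, together with the trial-count concern noted above, might be sketched as follows. The bivariate normal form of the joint pdf's, all numeric parameters, the acceptable interval, and the helper names are assumptions made only for illustration. 

```python
import math
import random


def sample_pair(rng, mu_a, sd_a, mu_b, sd_b, rho):
    """Draw a correlated normal pair (an assumed form for a joint pdf)."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    a = mu_a + sd_a * z1
    b = mu_b + sd_b * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
    return a, b


def y1(x2, x3, x5, x6, k2=0.5, k3=0.5):
    y4 = x5 + (x3 - 25.0) * k2
    y5 = x6 + (x3 - 25.0) * k3
    return y5 * x2 / (y4 + y5)


def estimate_p_m2(n, lo, hi, seed=1):
    """Fraction of trials with lo < y1 < hi, and its binomial standard error."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        x2, x3 = sample_pair(rng, 10.0, 0.1, 25.0, 20.0, 0.3)        # p(x2, x3)
        x5, x6 = sample_pair(rng, 5000.0, 50.0, 5000.0, 50.0, 0.8)   # p(x5, x6)
        if lo < y1(x2, x3, x5, x6) < hi:
            hits += 1
    p = hits / n
    return p, math.sqrt(p * (1.0 - p) / n)


def trials_needed(p, std_err):
    """Approximate trials for a target standard error of the estimate."""
    return p * (1.0 - p) / std_err ** 2


p_hat, se = estimate_p_m2(20000, 4.9, 5.1)   # assumed acceptable interval, volts
print(p_hat, se, trials_needed(0.999, 1e-4))
```

Under the binomial approximation, estimating a reliability near 0.999 to a standard error of 0.0001 takes on the order of 100,000 trials, which is the practical difficulty the text points out. 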

Mode $m_3$ 

The performance attribute $E_{02}$ (or $y_2$) is known: 

$$E_{02} = y_2 = x_2 .$$ 

Mode reliability is defined to be the probability that $y_2$ (or $x_2$) falls 
within $\Gamma_{y_2}$, i.e., 

$$P(m_3) = \int_{x_3} \int_{\Gamma_{y_2}} p(x_2, x_3)\, dx_2\, dx_3 .$$ 


Mode $m_4$ 

Electrical equations needed here are the conventional ones for the power stress 
$W_Z$: 

$$W_Z = u_3 = E_Z (I_1 - I_2) = x_2 (u_1 - u_2)$$ 

where 

$$I_1 = u_1 = \frac{E_i - E_Z}{R_1} = \frac{x_1 - x_2}{y_3}, \qquad I_2 = u_2 = \frac{E_Z}{R_2 + R_3} = \frac{x_2}{y_4 + y_5} .$$ 

The mode reliability $P(m_4)$ is defined as the probability of no catastrophic failure 
of the zener reference diode $Z$. The relationship of $P(m_4)$ to fixed levels of the 
stresses of temperature °C and power $W_Z$ is known: 

$$P(m_4 \mid W_Z, {}^{\circ}\mathrm{C}) = P(m_4 \mid u_3, u_4) = m_4(u_3, u_4) .$$ 

The power $u_3$ is a function of $\mathbf{x}$ (all the x's), denoted by 
$u_3 = u_3(\mathbf{x})$. The ambient temperature $u_4$ is known, °C $= x_3 = u_4$. 
Now $\mathbf{x}$ has a joint pdf denoted by $p(\mathbf{x})$. Hence the mode 
reliability is the expected value 

$$E[P(m_4 \mid u_3, u_4)] = \int_{\mathbf{x}} m_4[u_3(\mathbf{x}), x_3]\, p(\mathbf{x})\, d\mathbf{x} .$$ 

The reason for obtaining the expected value of the probability of no failure given 
the stress is the fact that some or all of the x's are probabilistic variables having 
pdf's. If the failure occurred when the stress exceeded a particular value, then the 
probability would be obtained in a manner similar to that of $m_2$ and $m_3$. 
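
The expected-value operation for this mode can be approximated by averaging $m_4$ over sampled values of the common variables. In the sketch below the stress relationship `m4` is purely hypothetical (an exponential survival form with invented constants, not a curve from MIL-HDBK-217A), and the independent normal densities in `sample_x` are a simplifying assumption; the report allows dependence among the x's. 

```python
import math
import random


def m4(u3, u4):
    """Hypothetical P(no zener failure | power u3 in watts, temperature u4 in deg C)."""
    rate = 1e-3 * math.exp(4.0 * u3 + 0.02 * (u4 - 25.0))   # assumed stress model
    return math.exp(-rate)


def u3_of_x(x1, x2, x3, x4, x5, x6, k=(0.1, 0.5, 0.5)):
    """Zener power u3 = x2 (u1 - u2) from the common variables."""
    dt = x3 - 25.0
    y3 = x4 + dt * k[0]
    y4 = x5 + dt * k[1]
    y5 = x6 + dt * k[2]
    u1 = (x1 - x2) / y3
    u2 = x2 / (y4 + y5)
    return x2 * (u1 - u2)


def sample_x(rng):
    """Assumed independent normal densities for x1..x6 (illustration only)."""
    return (rng.gauss(28.0, 1.0), rng.gauss(10.0, 0.1), rng.gauss(25.0, 20.0),
            rng.gauss(1000.0, 10.0), rng.gauss(5000.0, 50.0), rng.gauss(5000.0, 50.0))


def mode_reliability(n, seed=2):
    """Sample average of m4[u3(x), x3], approximating its expected value."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = sample_x(rng)
        total += m4(u3_of_x(*x), x[2])
    return total / n


r_m4 = mode_reliability(5000)
print(r_m4, m4(0.17, 25.0))
```

The second printed value is $m_4$ evaluated at the assumed nominal stresses; comparing it with the average shows how much the spread in the common variables shifts the mode reliability. 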

Mode $m_5$ 

No electrical equations are used, as the relation for $P(m_5)$ contains only a single 
stress, temperature, °C: 

$$P(m_5 \mid {}^{\circ}\mathrm{C}) = P(m_5 \mid u_4) = m_5(u_4) \quad \text{where } u_4 = x_3 .$$ 

Mode reliability is the expected value 

$$E[P(m_5 \mid u_4)] = \int_{x_3} m_5(x_3)\, p(x_3)\, dx_3 .$$ 

Mode $m_6$ 

Electrical equations are for the power stress $W_{R_2}$: 

$$W_{R_2} = u_5 = I_2^2 R_2 = u_2^2 y_4$$ 

where $u_2$ is noted in $m_4$. Mode reliability is similar to $m_4$: 

$$P(m_6 \mid {}^{\circ}\mathrm{C}, W_{R_2}) = P(m_6 \mid u_4, u_5) = m_6(u_4, u_5)$$ 

where $u_4 = x_3$ and $u_5 = u_5(x_2, x_3, x_5, x_6) = u_5(\mathbf{x}')$, say, where 
$\mathbf{x}' = (x_2, x_3, x_5, x_6)$; thus 

$$E[P(m_6 \mid u_4, u_5)] = \int_{\mathbf{x}'} m_6[x_3, u_5(\mathbf{x}')]\, p(\mathbf{x}')\, d\mathbf{x}' .$$ 

Mode $m_7$ 

The development here is similar to $m_6$, where $y_4$ for $m_6$ becomes $y_5$ for 
$m_7$. The reliability in functional notation is identical to that for $m_6$. 

Composite Reliability 

The reliability of the circuit is the probability that none of the failure modes 
occur: 

$$R = P(m_1 m_2 \cdots m_7) .$$ 

Although this probability can be expressed as the product of conditional reliabilities, 

$$R = P(m_1)\, P(m_2 \mid m_1) \cdots P(m_7 \mid m_1, \ldots, m_6) ,$$ 

this does not aid in the evaluation of the reliability in this example due to the 
commonality of the variables to the various modes. Thus the mode reliabilities 
which were formulated individually in the previous discussion cannot be multiplied 
together to obtain the overall reliability. The circuit reliability is obtained by 
the evaluation of a multiple integral which simultaneously considers the probabilities 
of non-failure of the seven modes. Thus 

$$R = P(m_1) \int\!\!\int_{\mathbf{x}' \in \Gamma_{y_1} \text{ and } \mathbf{x}' \in \Gamma_{y_2}} m_4(\mathbf{x})\, m_5(x_3)\, m_6(\mathbf{x}')\, m_7(\mathbf{x}')\, p(\mathbf{x} \mid m_1)\, d\mathbf{x}$$ 

where all terms are as developed in the preceding discussion. The region of 
integration is a restricted one: only for those values of $\mathbf{x}'$ contained in 
$\Gamma_{y_1}$ and $\Gamma_{y_2}$ is there a success. In words, the reliability of 
the circuit is a multiple integral over the acceptable regions of the variables 
defined by bounds. The integral contains the product of the conditional 
probabilities of non-failure of those modes, conditioned on the environment 
distributions. 

The above reliability expression is rather formidable, indicating that 
consideration of dependence resulting from correlation between variables and from 
the effect of the same basic variables on more than one mode reliability yields a 
complex relationship. 

A numerical integration would be tedious and require a computerized solution. 
It would not seem possible to provide a single computer program to treat a very wide 
class of these problems, although specific subroutines are available to perform 
numerical integrations. Thus one must use an approximate numerical solution. The 
simplest approach would seem to be a Monte Carlo simulation. Numerical computation is 
discussed later in Sec. 12.2. 
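
The contrast between the composite integral and a simple product of separately computed mode reliabilities can be illustrated by simulation. The sketch below is only an illustration under heavy assumptions: the joint densities are taken as correlated normals, the stress relationships for $m_4$ through $m_7$ are hypothetical exponential forms with invented constants, and the bounds and $P(m_1)$ are arbitrary. 

```python
import math
import random

K = (0.1, 0.5, 0.5)        # assumed temperature coefficients k1..k3
P_M1 = 0.999               # assumed fixed reliability of mode m1
GAMMA_Y1 = (4.9, 5.1)      # assumed acceptable interval for y1, volts
GAMMA_Y2 = (9.8, 10.2)     # assumed acceptable interval for y2, volts


def pair(rng, mu_a, sd_a, mu_b, sd_b, rho):
    """Draw a correlated normal pair (an assumed form for the joint pdf's)."""
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return mu_a + sd_a * z1, mu_b + sd_b * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)


def sample_x(rng):
    x1 = rng.gauss(28.0, 1.0)                             # p(x1 | m1)
    x2, x3 = pair(rng, 10.0, 0.1, 25.0, 20.0, 0.3)        # p(x2, x3)
    x4 = rng.gauss(1000.0, 10.0)                          # p(x4)
    x5, x6 = pair(rng, 5000.0, 50.0, 5000.0, 50.0, 0.8)   # p(x5, x6)
    return x1, x2, x3, x4, x5, x6


def trial(x):
    """Return (y1 in bounds, y2 in bounds, [m4, m5, m6, m7]) for one draw of x."""
    x1, x2, x3, x4, x5, x6 = x
    dt = x3 - 25.0
    y3, y4, y5 = x4 + dt * K[0], x5 + dt * K[1], x6 + dt * K[2]
    u1, u2 = (x1 - x2) / y3, x2 / (y4 + y5)
    y1, y2 = y5 * x2 / (y4 + y5), x2
    u3, u5, u6 = x2 * (u1 - u2), u2 * u2 * y4, u2 * u2 * y5
    heat = math.exp(0.02 * dt)                           # hypothetical temperature factor
    m = [math.exp(-1e-3 * heat * math.exp(4.0 * u3)),    # m4(u3, u4)
         math.exp(-5e-4 * heat),                         # m5(u4)
         math.exp(-5e-4 * heat * math.exp(20.0 * u5)),   # m6(u4, u5)
         math.exp(-5e-4 * heat * math.exp(20.0 * u6))]   # m7(u4, u6)
    ok1 = GAMMA_Y1[0] < y1 < GAMMA_Y1[1]
    ok2 = GAMMA_Y2[0] < y2 < GAMMA_Y2[1]
    return ok1, ok2, m


def composite(n, seed=3):
    """Estimate the composite integral and, for contrast, the naive product."""
    rng = random.Random(seed)
    joint = 0.0
    hits = [0, 0]
    sums = [0.0] * 4
    for _ in range(n):
        ok1, ok2, m = trial(sample_x(rng))
        hits[0] += ok1
        hits[1] += ok2
        for j in range(4):
            sums[j] += m[j]
        if ok1 and ok2:
            joint += math.prod(m)     # integrand is zero outside the bounds
    naive = P_M1 * (hits[0] / n) * (hits[1] / n) * math.prod(s / n for s in sums)
    return {"joint": P_M1 * joint / n, "product": naive}


result = composite(20000)
print(result)
```

With these assumed inputs the two output voltages both depend on $x_2$, so their bound-crossing events are strongly associated and the naive product understates the composite reliability. The direction and size of the discrepancy depend entirely on the assumed dependence structure. 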
12. General Model Development 

The features which were contained in the example problem of Sec. 11 are 
brought together and are expressed in general notation. This notation explicitly 
allows much detail (known or starting information), but only part of this detail 
would be expected in a specific problem. Keep in mind that the example of Sec. 11 
was an illustration of this generalization. 

In Sec. 12.1 a prediction model is developed for the series situation where 
occurrence of any failure mode will imply system failure. Numerical solution 
approaches are briefly covered in Sec. 12.2 for the series situation model. Next, 
Sec. 12.3.1 briefly comments on extending the series model to include the explicit 
treatment of time. The final Sec. 12.3.2 comments on the extension of the series 
situation model to a parallel situation where some failure modes can occur but the 
system remains unfailed. 

12.1 Series Situation Model 

A detailed reliability prediction model is developed for the situation where 
the occurrence of any failure mode results in system failure. This will be referred 
to as the series situation model. However, the reader is cautioned not to expect 
that the final composite model will literally be a product of individual probabilities 
of non-failure of each mode. Explicit consideration of mode dependencies results in 
the final composite model being of a different form than a product. 

12.1.1 Notation 

Much of the material in Secs. 10 and 11 pertains to the selection of notation. 
Seeing how to structure the detailed reliability prediction problem considered here 
is aided by an approach which leans toward using common notation for mathematically 
similar descriptions rather than using different symbols for the different physical 
features having common mathematical descriptions. 

As conventionally used: 

$t$                     Time, 
$y = y(x)$, $w = w(x)$  Functional relationship, 
$\mathbf{x}$            Vector, i.e., $\mathbf{x} = (x_1, x_2, \ldots, x_n)$, 
$P(A)$                  Probability of the event $A$, 
$p(x)$                  Probability density function (pdf) of $x$, and 
$\Gamma$                Bounds (region of acceptable values). 

Additional notation which is not so conventional and which will be explained 
in the following sections: 

$d_i$     Event that a failure mode which will be referred to as direct-fixed 
          does not occur, $i = 1, 2, \ldots, \ell$. The event that this failure 
          mode does occur is $\bar{d}_i$. 

$e_j$     Event that a failure mode which will be referred to as direct-variable 
          does not occur, $j = 1, 2, \ldots, m$. The event that this failure 
          mode does occur is $\bar{e}_j$. 

$b_k$     Event that a failure mode which will be referred to as bound-crossing 
          does not occur, $k = 1, 2, \ldots, n$. The event that this failure 
          mode does occur is $\bar{b}_k$. 

The d, e, and b will replace the m used in the example problem of Sec. 11, as 
the failure modes illustrated there are now being classified into the three 
types of mathematical descriptions which were used. Additionally, 

$x_s$     Common variable, 
$y_v$     Performance attribute, 
$u_w$     Environment. 

The shorter expression as noted below will be used to indicate the joint 
occurrence of events. 

Conventional form: 

$$P(d_1, d_2, \ldots, d_\ell) = P(d_1)\, P(d_2 \mid d_1) \cdots P(d_i \mid d_1, d_2, \ldots, d_{i-1}) \cdots P(d_\ell \mid d_1, d_2, \ldots, d_{\ell-1})$$ 

Shorter form: 

$$P(\mathbf{d}) = \prod_{i=1}^{\ell} P(d_i \mid \mathbf{d}')$$ 

where $\mathbf{d}'$ thus indicates appropriate conditional events. 
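
A small numerical case may help fix the meaning of the conditional product. The sketch below uses an invented joint probability table for two dependent direct-fixed modes; the four probabilities are assumptions chosen only so that the arithmetic is easy to follow. 

```python
# Joint probabilities for two direct-fixed modes; True means "mode does not occur".
# The four numbers are invented for illustration and sum to 1.
joint = {
    (True, True): 0.93,
    (True, False): 0.02,
    (False, True): 0.03,
    (False, False): 0.02,
}

p_d1 = joint[(True, True)] + joint[(True, False)]    # P(d1)
p_d2 = joint[(True, True)] + joint[(False, True)]    # P(d2)
p_d2_given_d1 = joint[(True, True)] / p_d1           # P(d2 | d1)

chain = p_d1 * p_d2_given_d1     # conditional product, equals P(d1, d2)
independent = p_d1 * p_d2        # product that ignores the dependence

print(chain, independent)
```

The conditional product recovers the joint probability of 0.93, while the product of the marginal probabilities gives about 0.912 because it ignores the dependence between the two modes. 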

12.1.2 Common Variables 

There are common variables $\mathbf{x}$ which influence the probability of certain 
failure modes. The common variables may be deterministic or probabilistic; this 
discussion emphasizes them as probabilistic. The complete probability density 
$p(\mathbf{x})$ of all probabilistic common variables is given information, 
including any dependence. No special acknowledgement is made in the $p(\mathbf{x})$ 
notation for those common variables which are deterministic. In the problem of 
Sec. 11 all of the x's were probabilistic common variables, for example, $x_1$, 
input voltage; $x_3$, ambient temperature; and $x_4$, resistance of $R_1$ at 25°C. 
Thus common variables could be interface characteristics such as supply voltage, 
load, or temperature. Also they could be internal characteristics of parts such as 
resistance or beta. 

The common variables appear in functional relationships for obtaining performance 
attributes, $y_v = y_v(\mathbf{x})$ for all v, and environments, 
$u_w = u_w(\mathbf{x})$ for all w, as in the conventional engineering equations 
where all variables are deterministic. Examples of performance attribute equations 
were those for the y's in the problem of Sec. 11, and examples of environmental 
(or stress) equations were those for the u's. Probability densities of the 
performance attributes $\mathbf{y}$ and the environments $\mathbf{u}$ will be needed 
and are not usually known. They can be determined (in concept) from known 
probability densities of the common variables $\mathbf{x}$ and the functional 
relationships $y(\mathbf{x})$ and $u(\mathbf{x})$. When the probability density of a 
performance attribute or an environment is known, including any known dependence 
with common variables, it will be initially classified as a common variable. Thus 
if a $p(y, \mathbf{x})$ or a $p(u, \mathbf{x})$ is initially known, the y or u will 
be introduced into the composite problem structure as $y = x$ or $u = x$. This is 
done for two reasons. The first is to avoid additional special notation for what 
are "special cases" in the context of the more complex composite model being 
formulated. The second reason is to help ensure that some of the more devious 
correlation effects are included, such as the following examples. In the problem of 
Sec. 11 zener reference voltage $x_2$ was directly a performance attribute in mode 
$m_3$, where $y_2 = x_2$, and also appeared in several environment and performance 
equations. The performance attribute probability densities which are obtained 
directly from testing, either by necessity (its $y = y(\mathbf{x})$ not known) or 
for convenience, would tend to be of this nature. The temperature $x_3$ also 
appeared in several environment and performance equations, including that for 
$m_5$, where it was directly an environment, $u_4 = x_3$. 

12.1.3 Modes 

An undesired event which may or may not occur is a failure mode, e.g., the 
loss of an input voltage, the opening of a resistor, or the drift of a performance 
attribute outside prescribed bounds. There may be several failure modes for a single 
item, e.g., the opening or shorting of the resistor, or a mode may involve more than 
one item, e.g., an output voltage of an amplifier comprised of multiple items. A mode 
may be a feature of other than hardware, e.g., physical shock impulse or a human error. 

Thus, in general, a failure mode can be some undesired feature of a part within 
a system, an input to a system, or an output of a system, including human features. 
Further, what is physically a single item at the smallest level of repair may have 
more than one mode associated with it, and a mode may involve more than one physical 
item. The problem treated is primarily concerned with the non-occurrence of a failure 
mode. The probability that a failure mode will not occur is either known or can be 
determined from functional relationships and probabilistic methods. 

Modes are classified below according to the manner in which they are treated 
in the analysis. System and part failures are commonly thought of as catastrophic 
or degradation, where a catastrophic failure is an abrupt change in some characteristic, 
and a degradation (or drift) failure is a characteristic value outside of some bounds. 
In general, each of the mode description types which are noted below may be for an 
event which would commonly be considered as either a catastrophic or degradation 
failure. That is, there is not necessarily a unique form of the mathematical 
description for either a catastrophic or a degradation failure. The distinction 
between catastrophic and drift failures is thus just a different method of 
classifying failure modes from that which is developed here. The example discussed 
in Sec. 10 illustrates this point. 

Direct-Fixed Mode. The reliabilities of these modes are fixed values which are 
known. These modes may or may not be dependent on each other, and the dependencies 
are known. An example of this mode in the problem of Sec. 11 is $m_1$, the loss of 
the input voltage. If the possibility had been considered of the parts being 
initially catastrophically failed so that no circuit operation was ever possible, 
these would also have been modes of this type. 

The reliability of a single mode, d, is 

$$0 < P(d) < 1 .$$ 

The reliability of all direct-fixed modes is 

$$R = P(\mathbf{d}) = \prod_{i=1}^{\ell} P(d_i \mid \mathbf{d}') \qquad (12\text{-}1)$$ 

In general these failure modes would be interface events and internal part events 
which preclude the existence of some common variable. Thus in the example cited 
the occurrence of failure mode 1, complete loss of the input voltage, will mean 
that no value of the input voltage (a common variable) $x_1$ is possible; hence the 
conditional density $p(x_1 \mid m_1)$. Also direct-fixed modes could be events 
completely aside from all common variables. 

Direct-Variable Mode. Reliability of each of these modes is conditional on 
some environment level, where there might be dependence between mode reliabilities 
at fixed environment levels. Each environment is a function of the common variables. 
Examples of direct-variable modes in the problem of Sec. 11 were $m_4$ through 
$m_7$, which were for no catastrophic failure of the parts. 

Reliability of a single mode is 

Given: $P(e \mid \mathbf{u}) = e(\mathbf{u})$, $u_w = u_w(\mathbf{x})$ for all w, and $p(\mathbf{x})$ 

Obtain: $P(e \mid \mathbf{x}) = e(\mathbf{x}) = e[\mathbf{u}(\mathbf{x})]$ 

$$R = P(e) = \int_{\mathbf{x}} e(\mathbf{x})\, p(\mathbf{x})\, d\mathbf{x} .$$ 

Thus, mode reliability is obtained by an averaging, the expected value operation. 
Figure 12-1 illustrates the development of this type of mode reliability description. 

Reliability for multiple modes is 

Given: $P(e_j \mid \mathbf{e}', \mathbf{u}) = e_j(\mathbf{u})$, $u_w = u_w(\mathbf{x})$ for all w, and $p(\mathbf{x})$ 

Obtain: $P(e_j \mid \mathbf{e}', \mathbf{x}) = e_j(\mathbf{x}) = e_j[\mathbf{u}(\mathbf{x})]$ 

$$R = P(\mathbf{e}) = \int_{\mathbf{x}} \Big[ \prod_{j=1}^{m} e_j(\mathbf{x}) \Big]\, p(\mathbf{x})\, d\mathbf{x} \qquad (12\text{-}2)$$ 

Environments $\mathbf{u}$ in these modes for electronic parts typically would be 
stresses such as current, power, or temperature. The deterministic equations 
$u_w = u_w(\mathbf{x})$ could be conventional electronic equations for obtaining 
stresses. Note that it is possible the mode reliability may be conditional on an 
environment where the environment is also a common variable, or some $u_w = x_s$. 
This was the situation in the problem of Sec. 11 where the reliability of $m_5$ was 
conditional only on temperature, and temperature also appeared in environment and 
performance equations. The direct-variable modes are where the type of environment 
information presented in MIL-HDBK-217A [Ref. 27] would be applicable, but note that 
this reference implies explicit treatment of time, which has not yet been introduced 
here, and it always assumes mode independence at fixed environment levels, as did 
the problem of Sec. 11. 

Bound-Crossing Mode. The reliability of a single bound-crossing mode is the 
probability that a performance attribute y remains within designated bounds 
$\Gamma_y$. Bounds are established either on the basis of judgment or on a more 
theoretical basis such as a condition for oscillation of an electronic circuit 
oscillator or for a stress-strength problem. Each performance attribute is a 
function of the common variables $\mathbf{x}$. Examples of this mode in the problem 
of Sec. 11 were modes $m_2$ and $m_3$ for the output voltages. 

Reliability of a single mode: 

Given: $y = y(\mathbf{x})$, $p(\mathbf{x})$, $\Gamma_y = \{\Gamma_\ell < y < \Gamma_u\}$ 

Obtain: 

$$R = P(b) = \int_{\Gamma_y} p(\mathbf{x})\, d\mathbf{x} .$$ 

Thus, the probability of success is the probability content of the region in 
$\mathbf{x}$ such that $\Gamma_\ell < y(\mathbf{x}) < \Gamma_u$. Fig. 12-2 
illustrates the development of this type of mode description. 

Reliability of multiple modes is: 

$$R = P(\mathbf{b}) = \int_{\Gamma_{\mathbf{x}}} p(\mathbf{x})\, d\mathbf{x} , \qquad (12\text{-}3)$$ 

where 

$$y_v = y_v(\mathbf{x}) \quad \text{and} \quad \Gamma_{\ell v} < y_v < \Gamma_{u v} \quad \text{for all } v .$$ 

An important point to note for the bound-crossing mode is that treatment of 
this mode does not involve the expected value operation. Rather, the bounds 
$\Gamma_y$ on the performance attributes yield two complementary regions in the 
common variables $\mathbf{x}$, with probability density $p(\mathbf{x})$; values of 
$\mathbf{x}$ in one region will result in failure and values of $\mathbf{x}$ in the 
other region will result in acceptable performance. 

Discussion of computation, i.e., transformation, method of moments, and Monte 
Carlo, for obtaining the reliability of this type of mode description was contained 
in mode $m_2$ in the problem of Sec. 11. 

12.1.4 Composite Model 

The three types of failure mode descriptions of Sec. 12.1.3 are brought together 
into a composite series model where the occurrence of any failure mode will mean 
system failure. Consider first the direct-fixed modes. 

$$P(\mathbf{d}) = \prod_{i=1}^{\ell} P(d_i \mid \mathbf{d}') \quad \text{where } 0 < P(d_i \mid \mathbf{d}') < 1 .$$ 

If there were no dependencies between any of these modes, then the resulting product 
of mode reliabilities would be the simple model which is so widely assumed for 
reliability of items in serial logic. 

Bring in the direct-variable modes: 

$$P(\mathbf{d}, \mathbf{e}) = P(\mathbf{d})\, P(\mathbf{e} \mid \mathbf{d}) = \prod_{i=1}^{\ell} P(d_i \mid \mathbf{d}') \int_{\mathbf{x}} \prod_{j=1}^{m} P(e_j \mid \mathbf{d}, \mathbf{e}', \mathbf{u})\, p(\mathbf{x} \mid \mathbf{d})\, d\mathbf{x}$$ 

where 

$$P(e_j \mid \mathbf{d}, \mathbf{e}', \mathbf{u}) = e_j(\mathbf{u}), \quad u_w = u_w(\mathbf{x}) \text{ for all } w .$$ 

The multiple mode descriptions above are expressed conditionally on other direct- 
variable modes. A reason is that several different modes may apply to the same 
physical item. For example, if a two-terminal electronic part has the open and short 
failure modes explicitly treated, then the part can fail either by (open) or by 
(short | no open), or vice versa. This possibility was not explicitly treated in the 
problem of Sec. 11. Introduce the bound-crossing modes for the complete model: 

$$R = P(\mathbf{d}, \mathbf{e}, \mathbf{b}) = P(\mathbf{d})\, P(\mathbf{e}, \mathbf{b} \mid \mathbf{d}) = \prod_{i=1}^{\ell} P(d_i \mid \mathbf{d}') \int\!\!\int_{\{\mathbf{x}'\},\, \{\mathbf{x}'' \in \Gamma_{\mathbf{x}}\}} \prod_{j=1}^{m} P(e_j \mid \mathbf{d}, \mathbf{e}', \mathbf{u})\, p(\mathbf{x} \mid \mathbf{d})\, d\mathbf{x} \qquad (12\text{-}4)$$ 

where $\mathbf{x} = (\mathbf{x}', \mathbf{x}'')$, $y_v = y_v(\mathbf{x}'')$, the 
$\mathbf{x}'$ do not appear in any $y_v = y_v(\mathbf{x}'')$, 
$\Gamma_{\ell v} < y_v < \Gamma_{u v}$ for all v, and the supporting information 
noted above for the direct-variable modes still applies. 

Equation 12-4 is the composite reliability prediction model. It is the 
general functional notation counterpart of the specific reliability prediction 
model which was formulated in Sec. 11. 

12.2 Numerical Calculation 

An approach for numerical calculation of the composite reliability model 
using Monte Carlo simulation is shown in Fig. 12-3. Step (3) in Fig. 12-3 can be 
omitted by using n instead of c in the denominator of step (4) if no estimate is 
wanted of the reliability of bound-crossing failures. It is also possible to obtain 
an estimate of the dispersion of the distribution of reliabilities, although this is 
not shown on Fig. 12-3. 
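
The sampling approach can be sketched in a few lines of Python. This is a minimal
illustration under stated assumptions, not a transcription of Fig. 12-3: the function
name `composite_reliability_mc`, the callables `sample_x`, `in_bounds` and
`direct_variable_rel`, and all numerical inputs are hypothetical. Dividing the
accumulated sum by c gives the direct-variable reliability conditional on no bound
crossing, and c/n estimates the bound-crossing reliability; dividing the same sum by n
gives the composite value directly, which is one reading of the remark about omitting
step (3).

```python
import math
import random


def composite_reliability_mc(n, sample_x, in_bounds, direct_variable_rel,
                             direct_fixed_rels, seed=0):
    """Monte Carlo estimate of the composite series reliability R.

    sample_x(rng)          -> one draw of the common variables x
    in_bounds(x)           -> True if every performance attribute y_v(x) is in bounds
    direct_variable_rel(x) -> product of direct-variable mode reliabilities at x
    direct_fixed_rels      -> conditional reliabilities of the direct-fixed modes
    """
    rng = random.Random(seed)
    c = 0        # samples with no bound crossing
    total = 0.0  # direct-variable reliability summed over those samples
    for _ in range(n):
        x = sample_x(rng)
        if in_bounds(x):
            c += 1
            total += direct_variable_rel(x)
    r_fixed = math.prod(direct_fixed_rels)
    r_bound = c / n                      # bound-crossing reliability estimate
    r_direct = total / c if c else 0.0   # direct-variable reliability, given in bounds
    return r_fixed * total / n, r_bound, r_direct


# Hypothetical toy inputs: x = (temperature in deg C, supply voltage in V).
def sample_x(rng):
    return rng.gauss(25.0, 10.0), rng.gauss(28.0, 1.0)


def in_bounds(x):
    temp, volt = x
    gain = volt * (1.0 - 0.002 * (temp - 25.0))  # assumed performance attribute y(x)
    return 26.0 < gain < 30.0


def direct_variable_rel(x):
    temp, _ = x
    return math.exp(-1.0e-4 * max(temp, 0.0))    # assumed reliability vs. environment


R, r_bound, r_direct = composite_reliability_mc(
    50000, sample_x, in_bounds, direct_variable_rel,
    direct_fixed_rels=[0.999, 0.995])
print(R, r_bound, r_direct)
```

The three returned values satisfy R = (product of direct-fixed reliabilities) x
r_bound x r_direct, so the contribution of each mode type can be inspected separately.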

Another approach for numerical calculation would be to use a discrete approximation
of the complete region of the common variables x instead of sampling the
region. Figure 12-4 shows this approach. A grid network would be established covering
the complete region and resulting in discrete cells. This approach would be useful
where some of the input information would be obtained directly from testing at the
nodes of the grid network. A discrete approximation approach would most likely be
applied to a limited number of common variables. In an experimental application of
notions similar to those in Part IV to a tilt-stabilization platform, temperature
and input voltage were considered as common variables [Ref. 24]. The region of
temperature and input voltage was approximated discretely, and testing was
conducted at each node to obtain input information for a bound-crossing
failure mode.
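
A minimal Python sketch of the grid approach follows. It is an illustration only: the
function name `composite_reliability_grid`, the node values, the probability masses, the
pass/fail result at each node, and the assumption that temperature and voltage are
independent (so that a cell probability is the product of two masses) are all
hypothetical and are not taken from Ref. 24.

```python
import math


def composite_reliability_grid(temps, volts, in_bounds, direct_variable_rel,
                               direct_fixed_rels):
    """Sum reliability over grid cells of two common variables.

    temps, volts: lists of (node value, probability mass); masses sum to 1.
    in_bounds(t, v): True if the bound-crossing mode does not occur at the node.
    direct_variable_rel(t, v): direct-variable mode reliability at the node.
    """
    total = 0.0
    for t, p_t in temps:
        for v, p_v in volts:
            if in_bounds(t, v):
                # cell probability times reliability in that cell
                total += p_t * p_v * direct_variable_rel(t, v)
    return math.prod(direct_fixed_rels) * total


# Hypothetical discretization: three temperature nodes, three voltage nodes.
temps = [(0.0, 0.2), (25.0, 0.6), (50.0, 0.2)]
volts = [(26.0, 0.25), (28.0, 0.5), (30.0, 0.25)]

# Assumed test outcome: only the hot, low-voltage node crosses a bound.
in_bounds = lambda t, v: not (t == 50.0 and v == 26.0)
direct_variable_rel = lambda t, v: 0.999 if t < 50.0 else 0.99

R_grid = composite_reliability_grid(temps, volts, in_bounds,
                                    direct_variable_rel, [0.99])
print(R_grid)
```

In practice `in_bounds` would be a lookup of test results at the nodes, which is the
situation where this approach is most attractive.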

Any realistic application of the concepts in Part IV would utilize a modern
digital computer. It is not felt that numerical computation would be the most
limiting factor in realistic applications.

Figure 12-3 Monte Carlo Simulation for Approximate Numerical Calculation

Figure 12-4 Approximate Numerical Calculation Using
Discrete Approximation of Common Variables

12.3 Additional Considerations 

Questions naturally arise concerning extensions to explicitly treating
time and to parallel situations. The essence of the approach has been illustrated
thus far, and brief comments are given below on these questions. The
comments are brief and qualitative because the mathematical notation becomes even
more complex. Some key features are noted as guidance for one seriously
pursuing the problem, who will have to develop the detail.

12.3.1 Explicitly Treating Time 

As might be expected, explicitly treating time in the series situation
model is a reasonably straightforward extension of the approach used in Sec. 12.1.
The concept of common variables and of the three types of mode descriptions remains
unchanged. Mathematical descriptions of reliability measures as functions of time
for the direct modes would be as described in Secs. 4.2 and 4.3 and of time-varying
probability density functions for common variables and performance attributes would
be as described in Sec. 5.4. Thus, for common variables the time dependence is
denoted by $p(x; t)$ and for a particular variable $x_s(t) = x_s(\underline{a}; t)$ for all
$s$, where $p(\underline{a})$ is the pdf of the constants,
$\underline{a} = (a_1, a_2, \ldots, a_n)$.

Nothing unique exists about the direct-fixed mode. There would be great
practical difficulty in obtaining the direct-variable mode reliability descriptions
conditional on a time-varying environment

$$P[e_j(t) \mid \underline{d}, e', u(t)] .$$

This difficulty arises because u(t) can take a large variety of forms.
It is, therefore, difficult to develop tables or other standard
information for general use with time-varying environments. A more practical
situation is where the direct-variable mode reliabilities are functions of time, but
the probability densities of the common variables, and thus the environments,
are not functions of time, i.e., conventional failure rate graphs of Ref. 27.

The bound-crossing mode will have the unique feature that there could be a specific
failure time for each possible value of a (where a is that noted above for the common
variables). This will have the effect of entering into the composite model as
truncations on the reliability time-functions of the direct-variable modes. Both
monotonic and some non-monotonic performance attribute variations could be treated, as
they both become first-crossing problems.
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12.3.2 Parallel Models 

The situation where the occurrence of certain failure modes does not mean
system failure will be referred to as the parallel one. Before proceeding to more
involved considerations it is pertinent to note that the expected value operation
as cited in Sec. 9.1 can be applied to conventional reliability models for redundancy
where the environment is a probabilistic variable. This straightforward approach is
useful in certain practical problems. Where the complexities of Sec. 12.1 are present,
branching modeling concepts could be used. First, the series situation model of
Sec. 12.1 would be structured for the system state where there are no failure modes.
For other non-failed system states some of the input information may be different
than for the system state where there are no failure modes. For instance, some common
variables could take on values of zero, and performance attribute and environment
equations could change. System-state change sequences would have to be traced, and
a detailed reliability prediction model developed for each sequence. Where time is
explicitly treated, the time at which system-state changes occur is an explicit
variable, and time-wise convolutions could be used in tracing through the detail.
Thus, explicit treatment of time in redundancy situations would significantly
increase complexity.

13. Concluding Remarks for Part IV

Each number below refers to the corresponding question noted in the introduction
to Part IV. Replies to these questions are intended to serve as concluding remarks
to Part IV. The replies take into consideration mainly the material presented in
Part IV.

(1) The first question concerns the conventional assumption of probabilistic
independence in reliability models. The probabilistic dependence treated in Part
IV results from various sources. Consider first the series model. (a) The most
straightforward dependence occurs in the given conditional probabilities in the
direct-fixed and the direct-variable modes. (b) The common variables can bring about
dependence among direct-variable modes in addition to that noted in (a), dependence
among bound-crossing modes, and dependence between direct-variable and bound-crossing
modes.

Probabilistic dependence should not be confused with error in structuring the
problem. Where there is a deterministic relation y = y(x) between
a performance attribute and common variables, and some of the common variables
were forced to be treated as modes, in that some judgment-based bound was put on each
of these common variables and the functional relationship ignored, then an
erroneous reliability prediction would simply be obtained. Additional dependence is
introduced when a general parallel model is developed, as features of the series
model may be conditional on the system state.

(2) A key feature in structuring the composite reliability prediction models
of Sec. 12 is recognition of the distinction between failure modes, common variables,
performance attributes, and environments. This distinction is of the sort which
tends to be obvious in hindsight but was not beforehand. A distraction seems to
be a tendency to want to treat separately different real-world features which are
really mathematically similar in the sense of structuring the composite reliability
prediction model. For example, for electrical equipment there is a tendency to separate
the internal part characteristics from the interface characteristics.

(3) The typical deterministic equations of engineering have been divided into
two categories, those concerned with performance attributes and those concerned with
environments. The performance attribute equations are used for the bound-crossing
mode. A performance attribute may be either an output of the system or it may be
some internal performance of the system. The latter is not of interest to the system
user, but there may be certain bounds within which the internal performance must
remain or else the output(s) of the system, which are of interest to the user, will
be affected. The environment equations are used in the direct-variable mode.

(4) The discussion in Sec. 10 illustrated that there is no 1-to-1 correspondence
between classification of a failure at a point of repair (source) into catastrophic
or degradation and a similar two-way classification of the manner in which the system
performance is affected. The models which are developed in Sec. 12 are based on
a classification system concerning the mathematical manner by which an individual
failure mode is described, and do not emphasize the classification system of
catastrophic and degradation failures.

(5) The composite models developed and presented in Sec. 12 show how the various
features are brought together. These composite models do not resemble the more familiar
prediction models which are widely used. Bringing the various failure mode types
and common variables together for even a simple series situation is shown to be
complex. Note also that the composite model includes many of the single-item reliability
measures of Part II and the conventional reliability models of Part III.

(6) The complexity of the composite reliability models in Sec. 12 and the
general lack of necessary input data combine to support the current practices of using
simpler models. It is possible that there may be problems where some features of the
composite model would offer some return which would be worth the effort. Certain
relatively simple systems which have high safety implications might warrant more
complex analyses. An example might be relatively simple devices concerned with explosives,
such as detonation circuits. The need for high reliability might justify the efforts
necessary to develop the appropriate data. Another possible application area would
be at a systems level with regard to redundancy. This would be as discussed in
Sec. 12.3.2 concerning the use of the expected value operation for redundancy where
some features of the composite model are dropped. In some situations there may be
some value in using features of the composite model in efforts to achieve balance in
design with regard to efforts to reduce various failure modes. In such cases little
emphasis would be given to the absolute numerical value of the reliability prediction
number, but rather the values would be compared for different design approaches. Also
note that in a real-world problem, it may be that only a small number of the variables
present in the problem will require treatment in the depth implied by the detailed
model.

Generally speaking, experimental applications and further investigation are
necessary in order to determine if more complex reliability prediction models along
these lines have anything to offer in a practical sense. Of course, before such
investigations can be attempted it is necessary that an approach to structuring the
problem be developed, and this necessary first step was an objective of the
investigations reported in Part IV.

Motivation for pursuing detailed reliability prediction models includes
maintainability objectives as well as reliability ones. A detailed reliability model
might eventually be useful for maintainability improvements concerning automated
predictive maintenance, test point selection, and repair procedure development. These
potential maintainability uses would generally require models for parallel
configurations.

APPENDIX


Mathematics of Prediction - Probability

The calculus of probability plays an important part in the prediction of 
system reliability. The basic definitions and distribution theory (discrete and 
continuous-univariate and multivariate) play a basic role. The Boolean algebra and 
the calculus of probabilities provide the appropriate analytical tools for manipu- 
lating these probabilistic inputs. In the calculation of a probability of system 
behavior there is little choice in the simplifying approaches that can be taken 
except to use some of the reliability bounds and approximations. Even to use these 
techniques requires a formal introduction to the basic methods and a thorough under- 
standing of the assumptions implied in their use. 

In order to make the written material as brief as possible, summary tables
have been prepared to cover specific topics such as continuous variables, Boolean
algebra, calculus of probabilities, etc. Supporting each of these tables are cited
references, discussions and examples demonstrating the techniques in the
corresponding table. Appendix references are contained along with all other
references in the single reference section.

A.1 Continuous Random Variables and Distributions

A continuous random variable is typically one that can take on any value in an
interval. For example, the lifetime of a transistor under a certain set of test
conditions could be any time greater than zero. For a very large population of
transistors one would expect the lives to be scattered or distributed over a large
interval of time. Continuous distribution functions are used to describe such
statistical behavior. Table A.1-1 summarizes basic concepts concerning continuous
random variables and their distributions. A summary of several common distributions
is presented in Table A.1-2. Given a density function p(x) the characteristics can
be evaluated by application of the formulae in Table A.1-1.

Central Limit Theorem 

One of the most important results in statistics is the central limit theorem (CLT),
which states that if $x_1, x_2, \ldots, x_n$ are independent random variables all having
the same distribution function F(x) with mean $\mu$ and standard deviation $\sigma$, then
the sum

$$s = \sum_{i=1}^{n} x_i$$

is asymptotically Normally distributed with mean $n\mu$ and standard deviation
$\sigma\sqrt{n}$, i.e.,

$$P(s \le s_0) \approx \frac{1}{\sigma\sqrt{2\pi n}} \int_{-\infty}^{s_0} \exp\left\{ -\frac{(s - n\mu)^2}{2 n \sigma^2} \right\} ds$$

for n sufficiently large. This result is true under very general conditions on F(x);
if all variables have the same distribution then it is sufficient that the second
moment of x be finite. A more general form of the CLT and additional discussion of
the above case appear in Ref. 52. An important aspect of the theorem is how large
n must be before the Normal approximation applies. Clearly this dependence on n is
conditioned by the shape of the distribution. Sums of variables having highly
skewed distributions would tend to Normality more slowly than those having
symmetrical or more nearly Normal distributions. In the latter case sums of more
than 25 or 30 variables are very closely approximated by the Normal distribution.
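
The behavior described by the theorem can be checked with a short simulation. The
following Python sketch is an illustration only; the choice of uniform variables on
(0, 1), n = 30, the number of trials and the seed are all assumptions made for the
example. It compares the observed mean and standard deviation of the sums with
$n\mu$ and $\sigma\sqrt{n}$, and the fraction of sums within one standard deviation
of the mean with the Normal value of about 0.6827.

```python
import random
import statistics


def sums_of_uniforms(n, trials, seed=0):
    """Return `trials` sums, each of n independent Uniform(0, 1) variables."""
    rng = random.Random(seed)
    return [sum(rng.random() for _ in range(n)) for _ in range(trials)]


n = 30
sums = sums_of_uniforms(n, trials=20000)

mu, sigma = 0.5, (1.0 / 12.0) ** 0.5   # mean and std. dev. of Uniform(0, 1)
pred_mean = n * mu                     # CLT mean of the sum
pred_sd = sigma * n ** 0.5             # CLT standard deviation of the sum

obs_mean = statistics.fmean(sums)
obs_sd = statistics.stdev(sums)

# Fraction of sums within one predicted standard deviation of the predicted mean.
within = sum(abs(s - pred_mean) <= pred_sd for s in sums) / len(sums)

print(pred_mean, obs_mean)
print(pred_sd, obs_sd)
print(within)
```

Repeating the experiment with a highly skewed parent distribution and the same n
would show a slower approach to Normality, as noted in the text.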
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1 . 


Table A.1-1

Continuous Random Variables and Distributions

1. x is a random variable (r.v.) having density function p(x) and
   (cumulative) distribution function F(x).*

2. $F(x) = \int_{-\infty}^{x} p(t)\,dt$ (t is a dummy variable) and $p(x) = \dfrac{dF(x)}{dx}$.

3. $F(-\infty) = 0$, $F(\infty) = 1$.

4. For the range R over which x is defined, $\int_R p(x)\,dx = 1$.

5. Probability:

   $P(x \le a) = \int_{-\infty}^{a} p(x)\,dx = F(a)$

   $P(a \le x \le b) = \int_{a}^{b} p(x)\,dx = F(b) - F(a)$

6. Expectation: For any function g(x),

   $E[g(x)] = \int_R g(x)\,p(x)\,dx$.

7. Mean of x (first moment about the origin):

   $E(x) = \int_R x\,p(x)\,dx = \nu_1$.

8. Mean square of x (second moment about the origin):

   $E(x^2) = \int_R x^2\,p(x)\,dx = \nu_2$.

9. k-th moment of x with respect to the origin:

   $E(x^k) = \int_R x^k\,p(x)\,dx = \nu_k$.

10. Variance of x (second moment about the mean):

   $E\{[x - E(x)]^2\} = \sigma^2(x) = \int_R [x - E(x)]^2\,p(x)\,dx = \mu_2$.

11. k-th moment of x about the mean:

   $E\{[x - E(x)]^k\} = \int_R [x - E(x)]^k\,p(x)\,dx = \mu_k$.

12. Relationship between the first four moments:

   $\mu_0 = \nu_0 = 1$

   $\mu_1 = 0$

   $\mu_2 = \nu_2 - \nu_1^2$, $\nu_1$ = mean value of x

   $\mu_3 = \nu_3 - 3\nu_2\nu_1 + 2\nu_1^3$

   $\mu_4 = \nu_4 - 4\nu_3\nu_1 + 6\nu_2\nu_1^2 - 3\nu_1^4$.

13. Truncated distribution, $F_T(x)$, of F(x):

   $F_T(x) = F(x)/F(T)$ for $x \le T$, and $F_T(x) = 1$ for $x > T$.

* In precise mathematical notation, X is used to denote a random variable, then
$F(x) = P(X \le x)$, and for a continuous variable $p(x)\,dx \approx P(x \le X \le x + dx)$.

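Example

Let x be a random variable with density function

$$p(x) = \lambda e^{-\lambda x}, \quad \lambda > 0, \; x > 0.$$

This is the well-known Weibull density function with $\theta = 1/\lambda$ and k = 1,
or the negative exponential density function.

Distribution:

$$F(x) = \int_0^x \lambda e^{-\lambda t}\,dt = 1 - e^{-\lambda x} \text{ for } x > 0, \qquad F(x) = 0 \text{ for } x \le 0.$$

Probability:

$$P(1 \le x \le 2) = \int_1^2 \lambda e^{-\lambda x}\,dx = e^{-\lambda} - e^{-2\lambda}$$

or

$$F(2) - F(1) = (1 - e^{-2\lambda}) - (1 - e^{-\lambda}) = e^{-\lambda} - e^{-2\lambda}.$$

Mean:

$$E(x) = \int_0^\infty \lambda x\, e^{-\lambda x}\,dx = \frac{1}{\lambda}.$$

Variance:

$$\sigma^2(x) = \int_0^\infty (x - 1/\lambda)^2\, \lambda e^{-\lambda x}\,dx = \frac{1}{\lambda^2}.$$

k-th moment about the origin:

$$\nu_k = \int_0^\infty x^k\, \lambda e^{-\lambda x}\,dx = \frac{\Gamma(k+1)}{\lambda^k} = \frac{k!}{\lambda^k}, \qquad (\Gamma(k) = (k-1)!).$$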
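
The closed-form results of this example can be checked numerically. The sketch below
is an illustration under stated assumptions: the value $\lambda$ = 0.5, the midpoint
integration rule, the number of steps, and the finite upper limit used in place of
infinity are all choices made for the example, and the helper `integrate` is
hypothetical.

```python
import math


def integrate(f, a, b, steps=100000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


lam = 0.5                                  # assumed rate parameter
pdf = lambda x: lam * math.exp(-lam * x)   # negative exponential density
upper = 100.0 / lam                        # stands in for infinity; tail is negligible

prob = integrate(pdf, 1.0, 2.0)                                    # P(1 <= x <= 2)
mean = integrate(lambda x: x * pdf(x), 0.0, upper)                 # E(x)
var = integrate(lambda x: (x - 1.0 / lam) ** 2 * pdf(x), 0.0, upper)
third = integrate(lambda x: x ** 3 * pdf(x), 0.0, upper)           # k = 3 moment

print(prob, math.exp(-lam) - math.exp(-2.0 * lam))
print(mean, 1.0 / lam)
print(var, 1.0 / lam ** 2)
print(third, math.factorial(3) / lam ** 3)
```

Each printed pair shows the numerical integral next to the corresponding closed-form
expression.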
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Table A. 1-2 

Continuous Density Functions and Associated Characteristics 
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A. 2 Discrete Random Variables and Distributions 

A discrete random variable is one that takes on a finite or a countably infinite
number of values. For example, a binomial variable takes on two values corresponding
to a success or a failure, such as tossing a coin and the occurrence of a head being a
success. On the other hand, the number of telephone calls on a given line for a
specified time may be approximated by a Poisson variable for time intervals of
"constant density". The number of calls might be considered to take on any one of a
countably infinite number of values, 0, 1, 2, ..., etc.

Table A.2-1 summarizes the definitions and notation for the characteristics of
distributions of discrete random variables. Table A.2-2 contains some of the common
discrete distributions and the means and the variances. Ref. 53 contains a complete
discussion of many discrete random variables and the pertinent characteristics.


Example 

Suppose that it is desired to obtain the probability of three or fewer
failures in a time interval of length t where an item upon failure is
replaced by a new item. Suppose further that the exponential failure
time distribution is applicable. Let the failure rate be $\lambda$ = 0.01/hour
and the time be 200 hours.

From the above information the mean or expected number of failures is
$\lambda t$ = 2. Furthermore the probability of x failures is given by the
Poisson formula and thus for three or fewer failures the probability
is expressed as

$$P(x \le 3) = \frac{e^{-2}\, 2^0}{0!} + \frac{e^{-2}\, 2^1}{1!} + \frac{e^{-2}\, 2^2}{2!} + \frac{e^{-2}\, 2^3}{3!} = 0.8571.$$
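
The Poisson sum in this example is easily evaluated directly. The following Python
sketch is a minimal illustration; the function name `poisson_cdf` is hypothetical.
Evaluated with a mean of 2, the sum comes to about 0.8571.

```python
import math


def poisson_cdf(k, mean):
    """P(x <= k) for a Poisson variable with the given mean."""
    return sum(math.exp(-mean) * mean ** i / math.factorial(i)
               for i in range(k + 1))


failure_rate = 0.01   # failures per hour
hours = 200.0
p_three_or_fewer = poisson_cdf(3, failure_rate * hours)
print(round(p_three_or_fewer, 4))
```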
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Table A. 2-1 

Basic Definitions Concerning Discrete Distributions 




Means and Variances 
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A. 3 Multivariate Distributions (Emphasis on Bivariate Case) 

Consider the situation in which two or more measurements on a part are being
obtained, e.g. the equivalent h-parameters of a transistor. Two such measurements
would have a joint probability density function (pdf) p(x, y), say, where x and y
denote the respective measurements. If the two variables are statistically
independent then

$$p(x, y) = p_1(x)\, p_2(y),$$

and hence the joint density function can be written down knowing the individual pdf's.
If the variables are not independent the multivariate density function can be obtained
by assuming a particular form such as the Normal density function and estimating the
unknown parameters from available data.

Most of the properties of bivariate (two-variate) distributions are straight- 
forward generalizations of the univariate distributions given earlier. The new 
concepts are those of conditional and marginal distributions, covariance and correla- 
tion. The generalization of these results to multivariate distributions is easily 
made and one should see Ref. 51 for these results. 

Independent Random Variables. If two variables x and y are independent then
the covariance of x and y, denoted by Cov(x, y), is

$$\mathrm{Cov}(x, y) = \iint (x - E(x))\, p_1(x)\, (y - E(y))\, p_2(y)\, dx\, dy = 0.$$

However the converse is not true, i.e. two variables may have zero covariance (or zero
correlation, i.e. $\rho(x, y) = 0$) but not be independent. For example, suppose that
u and v are independent variables having the same (non-Normal) distribution, and let
x = u+v, y = u-v. Then

$$E(xy) = E(u^2) - E(v^2) = 0, \quad E(y) = 0, \quad \text{and}$$

$$\mathrm{Cov}(x, y) = 0 \quad \text{and} \quad \rho(x, y) = 0.$$

However, x and y are dependent. See Ref. 17 for additional examples. Thus the
correlation is not a general measure of dependence but rather a measure of linear
dependence of two variables. In physical terms, the correlation coefficient is a
dimensionless covariance.
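
The example of zero covariance without independence can be illustrated by simulation.
The sketch below assumes, for illustration only, that u and v are each uniform on
(-1, 1); the function name `sample_xy` and the thresholds 1.5 and 0.5 are hypothetical
choices. Because |x| + |y| = 2 max(|u|, |v|) cannot exceed 2, a large |x| forces a
small |y|, which exposes the dependence even though the covariance is essentially zero.

```python
import random


def sample_xy(trials, seed=0):
    """Draw (x, y) = (u + v, u - v) with u, v independent Uniform(-1, 1)."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(trials):
        u = rng.uniform(-1.0, 1.0)
        v = rng.uniform(-1.0, 1.0)
        pairs.append((u + v, u - v))
    return pairs


pairs = sample_xy(20000)
n = len(pairs)
mean_x = sum(x for x, _ in pairs) / n
mean_y = sum(y for _, y in pairs) / n
cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs) / (n - 1)

# Unconditional chance that |y| < 0.5, versus the same chance given |x| > 1.5.
frac_small_y = sum(abs(y) < 0.5 for _, y in pairs) / n
big_x = [(x, y) for x, y in pairs if abs(x) > 1.5]
frac_small_y_given_big_x = sum(abs(y) < 0.5 for _, y in big_x) / len(big_x)

print(cov)
print(frac_small_y, frac_small_y_given_big_x)
```

The sample covariance is near zero, yet knowing that |x| is large changes the
probability that |y| is small from roughly 0.44 to 1.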
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Table A. 3-1 


Bivariate Distributions

1. Let x, y be a pair of random variables having the joint distribution function
   F(x, y) and density function p(x, y).

2. $\iint_R p(x, y)\,dx\,dy = 1$, where R is the region over which x and y are defined.

3. $\int_{-\infty}^{y} \int_{-\infty}^{x} p(u, v)\,du\,dv = F(x, y)$.

4. $F(-\infty, -\infty) = 0$, $F(\infty, \infty) = 1$.

5. $p(x, y) = \dfrac{\partial^2 F(x, y)}{\partial x\, \partial y}$.

6. $P(a \le x \le b,\; c \le y \le d) = \int_c^d \int_a^b p(x, y)\,dx\,dy$
   $= F(b, d) + F(a, c) - F(a, d) - F(b, c)$.

7. $E(g(x, y)) = \iint_R g(x, y)\, p(x, y)\,dx\,dy$.

8. $E(x) = \iint_R x\, p(x, y)\,dx\,dy$.

9. If x and y are independent random variables (r.v.'s) then
   $p(x, y) = p_1(x)\, p_2(y)$ and
   $E(x) = \int x\, p_1(x)\,dx$ and $E(y) = \int y\, p_2(y)\,dy$.

10. $E(xy) = \iint_R xy\, p_1(x)\, p_2(y)\,dx\,dy = E(x)\,E(y)$ if x and y are independent r.v.'s.

11. $E(x - E(x))^2 = \sigma^2(x)$, $E(y - E(y))^2 = \sigma^2(y)$.

12. $E\{(x - E(x))(y - E(y))\} = \mathrm{Cov}(x, y)$ = Covariance of x and y
    $= \iint [x - E(x)][y - E(y)]\, p(x, y)\,dx\,dy$.

13. Correlation of x and y $= \rho\{x, y\} = \mathrm{Cov}(x, y)/\sigma(x)\,\sigma(y)$
    where $\sigma(x) = [\sigma^2(x)]^{1/2}$ and $\sigma(y) = [\sigma^2(y)]^{1/2}$.

14. Marginal distribution of x is given by
    $p_1(x) = \int_{R_y} p(x, y)\,dy$.

15. The conditional distribution of y for given x is given by
    $p(y \mid x) = \dfrac{p(x, y)}{p_1(x)}$
    $= p_2(y)$ if x and y are independent r.v.'s.


Example 

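Let x and y have a bivariate density function

$$p(x, y) = \frac{1}{2\pi\sqrt{1-c^2}} \exp\left\{ -\frac{1}{2(1-c^2)} (x^2 - 2cxy + y^2) \right\}.$$

First of all note that

$$\iint p(x, y)\,dx\,dy = 1$$

since by completion of the square of the exponent

$$\iint p(x, y)\,dx\,dy = \iint \frac{1}{2\pi\sqrt{1-c^2}} \exp\left\{ -\frac{(x^2 - 2cxy + c^2y^2)}{2(1-c^2)} - \frac{(1-c^2)}{2(1-c^2)}\, y^2 \right\} dx\,dy.$$

If the variables are transformed as follows:

$$u = (x - cy)/\sqrt{1-c^2}$$

$$v = y$$

then

$$\iint p(x, y)\,dx\,dy = \iint \frac{1}{2\pi} \exp\left\{ -\left( \frac{u^2}{2} + \frac{v^2}{2} \right) \right\} du\,dv, \qquad \text{(A.3-1)}$$

using the fact that the Jacobian of the transformation is given by

$$\begin{vmatrix} \dfrac{\partial u}{\partial x} & \dfrac{\partial u}{\partial y} \\ \dfrac{\partial v}{\partial x} & \dfrac{\partial v}{\partial y} \end{vmatrix} = \begin{vmatrix} 1/\sqrt{1-c^2} & -c/\sqrt{1-c^2} \\ 0 & 1 \end{vmatrix} = 1/\sqrt{1-c^2}.$$

(A.3-1) can be written as the product of the integrals

$$\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-u^2/2}\,du \cdot \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-v^2/2}\,dv.$$

Since each is the integral of the standard Normal density function the above product
is unity.

Next the marginal distribution of y is given by

$$p_2(y) = \int p(x, y)\,dx = \frac{1}{\sqrt{2\pi}} \exp\left\{ -\frac{y^2}{2} \right\}.$$

Hence the conditional distribution of x given y is

$$p(x \mid y) = \frac{1}{\sqrt{2\pi(1-c^2)}} \exp\left\{ -\frac{1}{2(1-c^2)} (x - cy)^2 \right\}.$$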
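
The structure of this example can be checked by sampling. The sketch below is an
illustration under stated assumptions: the value c = 0.6, the sample size and the
function name `sample_bivariate_normal` are hypothetical. It inverts the transformation
above, drawing independent standard Normal u and v and setting y = v and
x = cy + sqrt(1 - c^2) u, and then checks that x has unit variance, that the
correlation is close to c, and that x - cy has variance close to 1 - c^2, in line
with the conditional distribution.

```python
import math
import random
import statistics


def sample_bivariate_normal(c, trials, seed=0):
    """Draw (x, y) pairs from the bivariate Normal density with correlation c."""
    rng = random.Random(seed)
    scale = math.sqrt(1.0 - c * c)
    pairs = []
    for _ in range(trials):
        u = rng.gauss(0.0, 1.0)
        v = rng.gauss(0.0, 1.0)
        pairs.append((c * v + scale * u, v))   # x = c*y + sqrt(1-c^2)*u, y = v
    return pairs


c = 0.6
pairs = sample_bivariate_normal(c, 20000)
xs = [x for x, _ in pairs]
ys = [y for _, y in pairs]

mx, my = statistics.fmean(xs), statistics.fmean(ys)
sx, sy = statistics.pstdev(xs), statistics.pstdev(ys)
corr = sum((x - mx) * (y - my) for x, y in pairs) / (len(pairs) * sx * sy)

var_x = sx ** 2
resid_var = statistics.pvariance([x - c * y for x, y in pairs])  # spread about c*y

print(var_x, corr, resid_var)
```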


Mean, Variance and Covariance Formulas

Let $x_1, x_2, \ldots, x_n$ be n random variables with means $\mu_1, \mu_2, \ldots, \mu_n$ and
variances $\sigma_1^2, \ldots, \sigma_n^2$ respectively and correlations
$\rho_{12} = \rho\{x_1, x_2\}, \rho_{13}, \ldots, \rho_{n-1,n}$. The following results are true
independent of the distributions of the variables. Let y be a linear combination of
the variables given by

$$y = c_0 + c_1 x_1 + c_2 x_2 + \ldots + c_n x_n .$$

Then the mean and variance of y are denoted by $\mu_y$ and $\sigma_y^2$ and are given by

$$\mu_y = c_0 + c_1\mu_1 + c_2\mu_2 + \ldots + c_n\mu_n = c_0 + \sum_{i=1}^{n} c_i \mu_i$$

$$\sigma_y^2 = c_1^2\sigma_1^2 + c_2^2\sigma_2^2 + \ldots + c_n^2\sigma_n^2 + 2\rho_{12}\, c_1 c_2\, \sigma_1\sigma_2 + \ldots + 2\rho_{n-1,n}\, c_{n-1} c_n\, \sigma_{n-1}\sigma_n$$

or

$$\sigma_y^2 = \sum_{i=1}^{n} c_i^2 \sigma_i^2 + 2 \sum_{i<j} c_i c_j\, \rho_{ij}\, \sigma_i \sigma_j ,$$

where $\sigma_i$ is the standard deviation of the i-th variable. The above formulas are true
in general and one notes that the mean of y does not involve the correlations.

Now if the variables are uncorrelated (as they are if independent, as indicated
previously) the formula for the variance reduces to

$$\sigma_y^2 = c_1^2\sigma_1^2 + c_2^2\sigma_2^2 + \ldots + c_n^2\sigma_n^2 .$$
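
The mean and variance formulas for a linear combination can be applied mechanically.
The following Python sketch is a minimal illustration; the function name
`linear_combination_moments` and the numerical inputs are hypothetical. The correlation
matrix is passed with ones on the diagonal, and only the terms with i < j are summed,
matching the factor of 2 in the formula.

```python
def linear_combination_moments(c0, c, mu, sigma, rho):
    """Mean and variance of y = c0 + sum(c[i] * x[i]).

    mu, sigma: means and standard deviations of the x[i].
    rho: correlation matrix (rho[i][i] = 1).
    """
    n = len(c)
    mean = c0 + sum(c[i] * mu[i] for i in range(n))
    var = sum((c[i] * sigma[i]) ** 2 for i in range(n))
    var += 2.0 * sum(rho[i][j] * c[i] * c[j] * sigma[i] * sigma[j]
                     for i in range(n) for j in range(i + 1, n))
    return mean, var


# Hypothetical case: y = 1 + 2*x1 - x2.
c0, c = 1.0, [2.0, -1.0]
mu, sigma = [10.0, 4.0], [1.0, 2.0]

mean_y, var_corr = linear_combination_moments(
    c0, c, mu, sigma, [[1.0, 0.5], [0.5, 1.0]])
_, var_uncorr = linear_combination_moments(
    c0, c, mu, sigma, [[1.0, 0.0], [0.0, 1.0]])

print(mean_y, var_corr, var_uncorr)   # 17.0 4.0 8.0
```

The mean is the same in both cases, while the positive correlation combined with
coefficients of opposite sign halves the variance.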


Now consider two functions

$$y = c_0 + c_1 x_1 + \ldots + c_n x_n$$

$$w = \ell_0 + \ell_1 x_1 + \ldots + \ell_n x_n ,$$

then the covariance of y and w is given by

$$\mathrm{Cov}\{y, w\} = c_1 \ell_1 \sigma_1^2 + \ldots + c_n \ell_n \sigma_n^2 + \sum_{j=1}^{n} \sum_{\substack{i=1 \\ i \ne j}}^{n} c_i \ell_j\, \sigma_i \sigma_j\, \rho_{ij} .$$


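If the functions are not linear it is often possible to use a Taylor series
expansion of the function f(x) and then apply the mean and variance computations
to this form. These formulas must be used with care, e.g. by checking the magnitude
of the errors which may result in using them. Thus if

$$y = f(x)$$

then

$$y \approx f(\underline{\mu}) + \sum_i \frac{\partial f}{\partial x_i}\bigg|_{\underline{\mu}} \Delta x_i + \frac{1}{2} \sum_i \frac{\partial^2 f}{\partial x_i^2}\bigg|_{\underline{\mu}} \Delta x_i^2 + \frac{1}{2} \sum_i \sum_{j \ne i} \frac{\partial^2 f}{\partial x_i\, \partial x_j}\bigg|_{\underline{\mu}} \Delta x_i \Delta x_j , \qquad \Delta x_i = x_i - \mu_i ,$$

and hence using only the first order terms (and taking the variables to be
uncorrelated)

$$\mu_y \approx f(\underline{\mu})$$

$$\sigma_y^2 \approx \sum_i \left( \frac{\partial f}{\partial x_i}\bigg|_{\underline{\mu}} \right)^2 \sigma_i^2 ,$$

where

$$\underline{\mu} = (\mu_1, \mu_2, \ldots, \mu_n),$$

and where $\dfrac{\partial f}{\partial x_i}\bigg|_{\underline{\mu}}$ denotes the evaluation of the derivative at $\underline{\mu}$.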
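
A short Python sketch of the first-order approximation, including a check on the
magnitude of the error, is given here as an illustration. The function name
`first_order_moments`, the use of central finite differences for the partial
derivatives, and the product function with its numerical inputs are all assumptions
made for the example; the variables are taken to be uncorrelated. For a product of two
independent variables the exact variance is known in closed form, which allows the
approximation error to be seen directly.

```python
def first_order_moments(f, mu, sigma, rel_step=1e-6):
    """First-order mean and variance of y = f(x) for uncorrelated x[i].

    Partial derivatives at mu are estimated by central finite differences.
    """
    mean = f(mu)
    var = 0.0
    for i, (m, s) in enumerate(zip(mu, sigma)):
        h = rel_step * max(1.0, abs(m))
        hi = list(mu)
        lo = list(mu)
        hi[i] = m + h
        lo[i] = m - h
        dfdx = (f(hi) - f(lo)) / (2.0 * h)
        var += (dfdx * s) ** 2
    return mean, var


# Hypothetical nonlinear function: y = x1 * x2.
product = lambda x: x[0] * x[1]
mu, sigma = [10.0, 2.0], [0.1, 0.05]

mean_approx, var_approx = first_order_moments(product, mu, sigma)

# Exact variance of a product of two independent variables.
var_exact = ((mu[0] * sigma[1]) ** 2 + (mu[1] * sigma[0]) ** 2
             + (sigma[0] * sigma[1]) ** 2)

print(mean_approx, var_approx, var_exact)
```

Here the first-order variance is about 0.29 against an exact value of 0.290025; the
neglected term grows as the coefficients of variation of the inputs grow.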


The above results are summarized in the following table. 


Table A.3-2

Mean, Variance, and Covariance Formulas

General Case for Single Function.

If $y = c_0 + c_1 x_1 + c_2 x_2 + \ldots + c_n x_n$,

then $\mu_y = c_0 + c_1\mu_1 + c_2\mu_2 + \ldots + c_n\mu_n = c_0 + \sum_i c_i \mu_i$

and $\sigma_y^2 = \sum_i c_i^2 \sigma_i^2 + \sum_i \sum_{j \ne i} c_i c_j\, \sigma_i \sigma_j\, \rho_{ij}$.

Variables Uncorrelated.

$\mu_y = c_0 + c_1\mu_1 + \ldots + c_n\mu_n = c_0 + \sum_i c_i \mu_i$

$\sigma_y^2 = \sum_i c_i^2 \sigma_i^2$.

General Case for Two Functions.

If $y = c_0 + \sum_i c_i x_i$
and $w = \ell_0 + \sum_j \ell_j x_j$
then

$\mathrm{Cov}(y, w) = \sum_{i=1}^{n} \sum_{j=1}^{n} c_i \ell_j\, \sigma_i \sigma_j\, \rho_{ij}$, with $\rho_{ii} = 1$.

If $x_i$ and $x_j$ are uncorrelated, i.e. $\rho_{ij} = 0$ for $i \ne j$, then

$\mathrm{Cov}(y, w) = \sum_i c_i \ell_i \sigma_i^2$.

General Case for Single Nonlinear Function.

If $y = y(x)$, $x = (x_1, \ldots, x_n)$,

then using only a first order approximation

$\mu_y \approx y(\underline{\mu})$, $\underline{\mu} = (\mu_1, \ldots, \mu_n)$, vector of means,

and

$\sigma_y^2 \approx \sum_i \left( \dfrac{\partial y}{\partial x_i} \right)^2 \sigma_i^2$.
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A. 4 Calculus of Probabilities 

Starting with certain definitions and axioms several useful theorems of the
calculus of probabilities can be derived. Eight such theorems are given in Table A.4-1.
A major difficulty in reliability literature stems from the notion of statistical
independence, often called just independence. This notion is basic to many reliability
calculations, for it is often assumed in these calculations that the failure of one
item in a system is independent of the failure of all other items in the system. In
non-probability language, two events are said to be independent if knowledge of the
outcome of one in no way alters the probability of the other event. The simplest example
of independent events is perhaps two tosses of a coin - the result of the first toss
in no way affects the outcome of the second toss, so the events are independent.

As a more pertinent example, suppose two amplifiers are selected at random from a
collection of 100 amplifiers from a production process. Does knowledge concerning
the value of current gain for the first amplifier alter in any way the probability
that the current gain for the second amplifier falls in any given interval? If the
answer is that it does not, then the two observations of current gain are independent.
In applications these results would usually be treated as independent because if the
current gain distribution were F(x), the observation of a value $x_1$ for the first
amplifier would not aid in locating the value $x_2$ for the second one, as it presumably
could fall anywhere on the defined region R for x with the same probability
distribution F(x) as that for the first observation.

Consider as another example the measuring of the current gain $y_c$ and the voltage
gain $y_v$ of a single amplifier. Does knowledge of the value of current gain alter
information concerning the voltage gain? Chances are that it would because high
values of $y_c$ may correspond to higher than average (or lower than average) values of
$y_v$ and vice versa. Thus it is normally assumed that such variables may be dependent
unless data analyses imply otherwise.

Similar examples can be considered in the reliability prediction area. If the
event of failure of one of two items in parallel in no way affects the failure
behavior of the other item the two events are independent. On the other hand if
failure of one would alter the probability distribution of failure time of the
second item the two events of failure are dependent.

In probabilistic terms the above discussion can be summarized as follows.
A and B are independent if

$$P(B \mid A) = P(B)$$

and hence

$$P(AB) = P(B \mid A)\, P(A) = P(B)\, P(A),$$

where $P(B \mid A)$ is read "the probability of the event B given the event A has occurred."

A further distinction is that between statistical independence and conditional 
independence; the latter has been denoted as physical independence in Ref. 54. 

Two events X and Y are said to be conditionally (physically) independent if 
and only if they are statistically independent under a given environment $E_i$, that is, 

$$P(XY \mid E_i) = P(X \mid E_i)\,P(Y \mid E_i).$$

Physical (conditional) independence does not necessarily imply statistical independence 
of the unconditional events X and Y. In order to compute the reliability of a system 
one usually obtains the conditional probabilities (that is, given the environments) 
and then obtains the weighted average of these conditional probabilities using the 
$P(E_i)$ as the weights. In mathematical terms, 

$$P(XY) = \sum_i P(XY \mid E_i)\,P(E_i),$$

or, when X and Y are physically independent in each environment, 

$$P(XY) = \sum_i P(X \mid E_i)\,P(Y \mid E_i)\,P(E_i).$$

Frequently in reliability prediction the mission is subdivided into phases in each 
of which the environment is essentially constant throughout the entire phase. Hence 
one uses a formula such as the above. A more complete discussion of the concepts of 
physical and statistical independence appears in Ref. 48 and Ref. 54. 
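
A small numerical sketch may help show why conditional independence does not carry over to the unconditional events. The two environments and all probabilities below are hypothetical values chosen only for illustration; they are not taken from the report or its references.

```python
# Hypothetical data: two environments and the success probabilities of
# items X and Y in each. X and Y are assumed conditionally independent.
p_env = {"E1": 0.5, "E2": 0.5}          # P(E_i), the weights
p_x_given = {"E1": 0.99, "E2": 0.80}    # P(X | E_i)
p_y_given = {"E1": 0.99, "E2": 0.80}    # P(Y | E_i)

# Weighted average of the conditional joint probabilities.
p_xy = sum(p_x_given[e] * p_y_given[e] * p_env[e] for e in p_env)

# Unconditional marginals, obtained the same way.
p_x = sum(p_x_given[e] * p_env[e] for e in p_env)
p_y = sum(p_y_given[e] * p_env[e] for e in p_env)

print(round(p_xy, 6), round(p_x * p_y, 6))
```

With these numbers P(XY) is about 0.810 while P(X) P(Y) is about 0.801, so the unconditional events are not statistically independent even though they are independent within each environment. Multiplying the unconditional marginals would understate the joint success probability here.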
Table A.4-1 

Calculus of Probabilities 

Definitions and Axioms 

1. $P(A)$ = Probability of A 

2. $0 \le P(A) \le 1$ 

3. $P(\text{Sure Event}) = P(I) = 1$ 

4. $P(\text{An Impossible Event}) = P(\phi) = 0$ 

5. $P(A+B) = P(A) + P(B) - P(AB)$ 

6. $P(B \mid A)$ = Probability of B on the hypothesis that A has occurred 

7. $P(AB) = P(A)\,P(B \mid A)$ 



Theorems 

1. $P(A + \bar{A}) = P(A) + P(\bar{A}) = 1$ 

2. $P(A) = P(AB) + P(A\bar{B})$ 

3. $P(A+B) = 1 - P(\bar{A}\bar{B})$ 

4. $P(A_1 + A_2 + \cdots + A_n) = P(A_1) + P(A_2) + \cdots + P(A_n) - P(A_1A_2) - P(A_1A_3) - \cdots - P(A_{n-1}A_n) + P(A_1A_2A_3) + \cdots + (-1)^{n-1} P(A_1A_2 \cdots A_n)$ 

5. $P(A_1A_2 \cdots A_n) = P(A_1)\,P(A_2 \mid A_1)\,P(A_3 \mid A_1A_2) \cdots P(A_n \mid A_1 \cdots A_{n-1})$ 

6. If $A_1, \ldots, A_n$ are all mutually independent events, then 

$$P(A_1A_2 \cdots A_n) = \prod_{i=1}^{n} P(A_i)$$

7. If $A_1, \ldots, A_n$ are pairwise mutually exclusive, i.e. $A_iA_j = \phi$ (null set) 
for all pairs $i, j = 1, \ldots, n$, $i \ne j$, then 

$$P(A_1 + A_2 + \cdots + A_n) = \sum_{i=1}^{n} P(A_i).$$

8. Bayes Rule - Let $B_1, B_2, \ldots$ be a collection of events which are mutually 
exclusive and exhaustive, i.e. $B_iB_j = \phi$ for $i \ne j$, and $B_1 + B_2 + \cdots = I$, 
then 

$$P(B_i \mid A) = \frac{P(B_iA)}{P(A)} = \frac{P(A \mid B_i)\,P(B_i)}{\sum_j P(A \mid B_j)\,P(B_j)}.$$
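
Bayes Rule (Theorem 8) can be illustrated with a short sketch. The scenario is hypothetical: an item is assumed to come from one of three lots $B_1, B_2, B_3$, and A is the event that the item is defective. The prior and conditional probabilities are invented for illustration only, and `bayes_posterior` is an illustrative helper name.

```python
def bayes_posterior(priors, likelihoods):
    """Return P(B_i | A) for each i, given P(B_i) and P(A | B_i).

    The B_i are assumed mutually exclusive and exhaustive, so the
    denominator P(A) is the sum of P(A | B_i) P(B_i) over all i.
    """
    joint = [p * l for p, l in zip(priors, likelihoods)]   # P(B_i A)
    p_a = sum(joint)                                       # P(A)
    return [j / p_a for j in joint]

# Hypothetical values, not taken from the report.
priors = [0.5, 0.3, 0.2]            # P(B_1), P(B_2), P(B_3)
likelihoods = [0.01, 0.02, 0.05]    # P(A | B_i)

posterior = bayes_posterior(priors, likelihoods)
print([round(p, 4) for p in posterior])
```

With these assumed values the third lot, although the least likely source a priori, becomes the most likely source once a defect has been observed.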
Consider the system reliability logic diagram shown below where the 
symbols assigned to each element represent the event of success for that 
element. 

[Figure: system reliability logic diagram with elements $A$, $B_1$, $B_2$, $C$, $D$, $E_1$, $E_2$; not reproduced.] 

Let the associated probabilities be $P(A) = P(B_1) = P(B_2) = 0.95$, 
$P(C) = 0.98$, $P(D) = 0.90$, $P(E_1) = P(E_2) = 0.90$. 

Let $P_1$ denote the path $A, B_1$ 
    $P_2$ denote the path $A, B_2$ 
    $P_3$ denote the path $C, D$ 
    $P_4$ denote the path $C, E_1, E_2$. 

Then the probability of success is the probability that at least one of the 
paths $P_1, \ldots, P_4$ is "good", that is, 

$$
\begin{aligned}
P(S) &= P\{P_1 + P_2 + P_3 + P_4\} \\
     &= P\{P_1\} + P\{P_2\} + P\{P_3\} + P\{P_4\} \\
     &\quad - P\{P_1P_2\} - P\{P_1P_3\} - P\{P_1P_4\} - P\{P_2P_3\} - P\{P_3P_4\} - P\{P_2P_4\} \\
     &\quad + P\{P_1P_2P_3\} + P\{P_1P_2P_4\} + P\{P_1P_3P_4\} + P\{P_2P_3P_4\} \\
     &\quad - P\{P_1P_2P_3P_4\}.
\end{aligned}
$$











Now 

$$P\{P_1\} = P\{AB_1\} = P\{A\}\,P\{B_1\},$$

assuming independence of the events $A$ and $B_1$; hence 

$$P\{P_1\} = (0.95)(0.95) = 0.9025.$$

Similarly the remaining probabilities are obtained. 

$P\{P_2\} = 0.9025$ 
$P\{P_3\} = 0.8820$ 
$P\{P_4\} = 0.7938$ 
$P\{P_1P_2\} = P\{AB_1AB_2\} = P\{AB_1B_2\} = 0.8574$ 
$P\{P_1P_3\} = P\{AB_1CD\} = 0.7960$ 
$P\{P_2P_3\} = P\{AB_2CD\} = 0.7960$ 
$P\{P_2P_4\} = P\{AB_2CE_1E_2\} = 0.7164$ 
$P\{P_1P_4\} = P\{AB_1CE_1E_2\} = 0.7164$ 
$P\{P_3P_4\} = P\{CDCE_1E_2\} = P\{CDE_1E_2\} = 0.7144$ 
$P\{P_1P_2P_3\} = P\{AB_1AB_2CD\} = P\{AB_1B_2CD\} = 0.7562$ 
$P\{P_1P_2P_4\} = P\{AB_1AB_2CE_1E_2\} = P\{AB_1B_2CE_1E_2\} = 0.6806$ 
$P\{P_1P_3P_4\} = P\{AB_1CDCE_1E_2\} = P\{AB_1CDE_1E_2\} = 0.6448$ 
$P\{P_2P_3P_4\} = P\{AB_2CDCE_1E_2\} = P\{AB_2CDE_1E_2\} = 0.6448$ 
$P\{P_1P_2P_3P_4\} = 0.6125$ 

Hence 

$$P(S) = 0.9981.$$

It should be noted that a particular kind of system redundancy is implied when the 
success probability is computed in the above way. Specifically, independence is 
required of all failure events, which usually implies active redundancy in the system. 
A computer program for performing a reliability prediction such as that above is 
described in Vol. II - Computation. 
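
The path calculation above can be reproduced in a few lines. The following is a minimal sketch, not the program described in Vol. II; it assumes independent element successes with the probabilities given in the example, and the function names are illustrative. It computes P(S) by the inclusion-exclusion expansion of Theorem 4 and checks the result by enumerating every combination of element states.

```python
from itertools import combinations, product

# Element success probabilities from the worked example.
p = {"A": 0.95, "B1": 0.95, "B2": 0.95, "C": 0.98,
     "D": 0.90, "E1": 0.90, "E2": 0.90}

# Success paths P_1 ... P_4, each a set of elements that must all work.
paths = [{"A", "B1"}, {"A", "B2"}, {"C", "D"}, {"C", "E1", "E2"}]

def prob_all(elements):
    """P(all listed elements succeed), assuming independence."""
    result = 1.0
    for name in elements:
        result *= p[name]
    return result

def inclusion_exclusion(paths):
    """P(at least one path is good) by the expansion of Theorem 4."""
    total = 0.0
    for k in range(1, len(paths) + 1):
        sign = 1 if k % 2 == 1 else -1
        for subset in combinations(paths, k):
            # A product of paths succeeds when the union of their elements does,
            # so a repeated element such as A in P_1 P_2 is counted once.
            total += sign * prob_all(set().union(*subset))
    return total

def enumerate_states(paths):
    """Cross-check: sum the probability of every state with a good path."""
    names = sorted(p)
    total = 0.0
    for state in product([True, False], repeat=len(names)):
        up = {n for n, ok in zip(names, state) if ok}
        if any(path <= up for path in paths):
            weight = 1.0
            for n, ok in zip(names, state):
                weight *= p[n] if ok else 1.0 - p[n]
            total += weight
    return total

p_s = inclusion_exclusion(paths)
p_check = enumerate_states(paths)
print(f"{p_s:.6f} {p_check:.6f}")
```

Both methods give about 0.99798 when no intermediate rounding is done. The small difference from the value 0.9981 quoted above arises because the hand calculation rounds each of the fifteen terms to four decimal places before summing.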
A.5 Boolean Algebra 

In predicting the reliability of a system we are concerned with various events, 
such as: a performance measure lies between two given values; no failures occur in an 
interval of time t; fewer than three defects are found in a sample of ten items from a 
lot of material; etc. Such events will be denoted by capital letters A, B, C, etc. An 
event is the result of an experiment and can be considered to be a collection of 
possible outcomes of an experiment within the space of all possible outcomes. For 
example, the selection of three or more good items from a lot of five items is an 
event. The sample space of all possible outcomes contains $2^5 = 32$ points, 
corresponding to all items bad (1 point), only one item good (5 points), two items good 
(10 points), ..., five items good (1 point). The event of three or more 
good items corresponds to 16 of these 32 sample points. The basic definitions, 
operations, and properties of Boolean Algebra are summarized in Table A.5-1. This 
summary covers only those introductory topics included in the first few chapters of 
a text on the subject. Ref. 37 gives a complete discussion of Boolean Algebra and 
its applications. 
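
The count of sample points in the five-item example can be confirmed by enumeration. This is an illustrative sketch only; each item is coded 1 for good and 0 for bad, a convention assumed here and not taken from the report.

```python
from collections import Counter
from itertools import product

# Every outcome for a lot of five items: 1 = good, 0 = bad.
sample_space = list(product((0, 1), repeat=5))

# Number of sample points for each possible count of good items.
counts = Counter(sum(point) for point in sample_space)

# The event "three or more good items" as a collection of sample points.
event = [point for point in sample_space if sum(point) >= 3]

print(len(sample_space), len(event))   # 32 16
print(sorted(counts.items()))          # [(0, 1), (1, 5), (2, 10), (3, 10), (4, 5), (5, 1)]
```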

Example 

Simplify the following expression: 

$$[\,\overline{(AB)} + C\,]\;\overline{(\bar{A} + \bar{C})}$$

Applying the dualization laws, 

$$[(\bar{A} + \bar{B}) + C]\,(AC).$$

The distributive law yields 

$$(AC\bar{A}) + (AC\bar{B}) + (ACC)$$

or 

$$\phi + (AC\bar{B}) + (AC)$$

which is 

$$(AC\bar{B}) + (AC).$$

Since $AC\bar{B} \subset AC$, the above is equivalent to $AC$. 
Table A.5-1 

Boolean Algebra 
(Algebra of Classes) 

Notation, Definitions, and Logical Operations: 

1. A sure event, an event which always occurs when an experiment or observation 
is made, denoted by $I$. 

2. An impossible event, an event which never occurs as an outcome of an experiment, 
denoted by $\phi$. 

3. The complementary event or complement of A is the event that A does not occur, 
denoted by $\bar{A}$. 

4. The sum or union of A and B, denoted by $A + B$ or $A \cup B$, is the event that at 
least one of A and B occurs. 

5. The product or intersection of A and B, denoted by $AB$ or $A \cap B$, is the event 
that both A and B occur. 

6. If occurrence of B implies the occurrence of A, then $B \subset A$. 

7. If $AB = \phi$, then A and B are disjoint. 

Let $F_0$ be a family of events which includes $I$ and which is closed with respect 
to the sum and product logical operations. Then events belonging to the field 
satisfy the following relations: 

$A + A = AA = A$ 

$A + B = B + A$, $AB = BA$ 

$(A+B) + C = A + (B+C)$, $(AB)C = A(BC)$ 

$A(B+C) = AB + AC$ 

$A + \bar{A} = I$, $A\bar{A} = \phi$ 

$A + I = I$, $AI = A$ 

$A + \phi = A$, $A\phi = \phi$. 

Dualization Laws: 

$\overline{A+B} = \bar{A}\bar{B}$ 

$\overline{AB} = \bar{A} + \bar{B}$. 
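
Identities of this kind can be checked mechanically by a truth table, treating each event as either occurring or not occurring. The sketch below is illustrative only; the helper name `holds` is an assumption and does not come from the report. Python's `or`, `and`, and `not` stand for the sum, product, and complement.

```python
from itertools import product

def holds(identity, n_vars):
    """True if the identity is satisfied for every assignment of its variables."""
    return all(identity(*values)
               for values in product([False, True], repeat=n_vars))

# Dualization laws: complement of a sum, complement of a product.
dual_sum = holds(lambda a, b: (not (a or b)) == ((not a) and (not b)), 2)
dual_product = holds(lambda a, b: (not (a and b)) == ((not a) or (not b)), 2)

# Distributive law A(B+C) = AB + AC.
distributive = holds(
    lambda a, b, c: (a and (b or c)) == ((a and b) or (a and c)), 3)

# A deliberately false "identity" to show that the check can fail.
not_an_identity = holds(lambda a, b: (a or b) == (a and b), 2)

print(dual_sum, dual_product, distributive, not_an_identity)
# True True True False
```

Because an expression in n events has only $2^n$ assignments, exhaustive checking is practical for the small expressions that arise in hand simplification.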